Continuous backup using a mirror device

ABSTRACT

Providing continuous backup of a storage device includes subdividing the storage device into subsections, providing a mirror device of the storage device that contains a copy of data that is on the storage device when the continuous backup is initiated, providing a time indicator that is modified periodically, and, in response to a request to write new data to a particular subsection of the storage device at a particular time, maintaining data being overwritten by the new data according to the particular subsection and according to a value of the indicator at the particular time, where, for a first write after the continuous backup is initiated, data from the mirror device is used to maintain data being overwritten. The subsections may be tracks. Maintaining the data being overwritten may include constructing a linked list of portions of data for each of the subsections. The portions of data may have variable sizes.

BACKGROUND OF THE INVENTION

1. Technical Field

This application relates to computer storage devices, and more particularly to the field of selectively maintaining and modifying portions of data stored on a computer storage device and corresponding to particular points in time.

2. Description of Related Art

Host processor systems may store and retrieve data using a storage device containing a plurality of host interface units (host adapters), disk drives, and disk interface units (disk adapters). Such storage devices are provided, for example, by EMC Corporation of Hopkinton, Mass. and disclosed in U.S. Pat. No. 5,206,939 to Yanai et al., U.S. Pat. No. 5,778,394 to Galtzur et al., U.S. Pat. No. 5,845,147 to Vishlitzky et al., and U.S. Pat. No. 5,857,208 to Ofek. The host systems access the storage device through a plurality of channels provided therewith. Host systems provide data and access control information through the channels of the storage device and the storage device provides data to the host systems also through the channels. The host systems do not address the disk drives of the storage device directly, but rather, access what appears to the host systems as a plurality of logical volumes. The logical volumes may or may not correspond to the actual disk drives.

Data backup services may be used to protect against data loss. Such services may be performed periodically (e.g., once or twice a day). When data on the main system is lost, it may be recovered from the backup media.

There are a number of drawbacks to such data backup services, including the fact that it is only possible to recover data that corresponds to data that was saved at a periodic backup. For example, if data is backed up at 9:00 a.m. and 3:00 p.m. daily, then a user is not able to recover data from, say, 11:00 a.m. If a user desires the 11:00 a.m. version of the data, the best he or she can do is obtain a copy of the backed up 9:00 a.m. version of the data and then perform steps to construct the 11:00 a.m. version of the data (e.g., by manually reconstructing the data).

One solution to this problem is to perform backups more frequently. However, increasing the frequency of backups increases the storage requirements for backup data and increases the overhead and complexity of the backup data. Ideally, it is desirable to allow obtaining data from any previous time by having a system with continuous or near continuous backups that does not have the increased storage requirements or complexity associated with increasing the frequency of backups.

SUMMARY OF THE INVENTION

According to the present invention, providing continuous backup of a storage device includes subdividing the storage device into subsections, providing a time indicator that is modified periodically, and, in response to a request to write new data to a particular subsection of the storage device at a particular time, maintaining data being overwritten by the new data according to the particular subsection and according to a value of the indicator at the particular time. The subsections may be tracks. Maintaining the data being overwritten may include constructing a linked list of portions of data for each of the subsections. The portions of data may have variable sizes. In response to two data write operations to a particular subsection at a particular value of the indicator, data being written for each of the two data write operations may be combined if data for the second data write operation is a subset of data for the first data write operation. Providing continuous backup of a storage device may also include restoring the storage device to a state thereof at a particular point in time by writing the maintained data to the storage device. Writing the maintained data to the storage device may include constructing subsections of the data by combining separate portions thereof corresponding to the same subsection. Providing continuous backup of a storage device may also include inserting data for a particular subsection at a particular point in time by traversing data corresponding to the particular subsection to obtain an appropriate insertion point. Providing continuous backup of a storage device may also include reading data for a particular subsection at a particular point in time by traversing data corresponding to the particular subsection and reading from the group consisting of: data from the storage device, maintained data, and a combination of maintained data and data from the storage device. Providing continuous backup of a storage device may also include compressing data by combining consecutive portions for a subsection.
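The write, point-in-time read, and restore operations summarized above can be illustrated with a minimal Python sketch. Everything named here (`Portion`, `ContinuousBackup`, `tick`, `read_at`, `restore`) is hypothetical and not taken from the system described herein. The sketch assumes that a plain Python list per track stands in for the linked list, that the time indicator is an integer counter, and that the state "at time t" includes every write made while the indicator was at or below t. Combining of writes and compression of portions are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Portion:
    """One variable-size piece of overwritten data (hypothetical layout)."""
    time: int     # value of the time indicator when the write occurred
    offset: int   # starting byte within the track
    data: bytes   # the bytes that the write replaced


class ContinuousBackup:
    def __init__(self, num_tracks: int, track_size: int):
        self.tracks = [bytearray(track_size) for _ in range(num_tracks)]
        # One history per track, oldest portion first; stands in for the
        # per-subsection linked list.
        self.history = [[] for _ in range(num_tracks)]
        self.indicator = 0

    def tick(self) -> None:
        """Advance the periodically modified time indicator."""
        self.indicator += 1

    def write(self, track: int, offset: int, new_data: bytes) -> None:
        """Save the data about to be overwritten, then apply the write."""
        end = offset + len(new_data)
        old = bytes(self.tracks[track][offset:end])
        self.history[track].append(Portion(self.indicator, offset, old))
        self.tracks[track][offset:end] = new_data

    def read_at(self, track: int, time: int) -> bytes:
        """Rebuild a track as of `time` from current data plus saved portions."""
        image = bytearray(self.tracks[track])
        # Undo writes newest first until reaching one made at or before `time`.
        for portion in reversed(self.history[track]):
            if portion.time <= time:
                break
            end = portion.offset + len(portion.data)
            image[portion.offset:end] = portion.data
        return bytes(image)

    def restore(self, time: int) -> None:
        """Revert every track to its state as of `time`."""
        for t in range(len(self.tracks)):
            self.tracks[t] = bytearray(self.read_at(t, time))
            self.history[t] = [p for p in self.history[t] if p.time <= time]
```

Because each portion records only the bytes that a write replaced, the space consumed by the history in this sketch grows with the amount of data written rather than with the number of points in time that can be recovered.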

According further to the present invention, computer software, in a storage medium, that provides continuous backup of a storage device, includes executable code that obtains a value of a time indicator that is modified periodically and executable code that, in response to a request to write new data to a particular subsection of the storage device at a particular time, maintains data being overwritten by the new data according to the particular subsection and according to a value of the indicator at the particular time. The subsections may be tracks. Executable code that maintains the data being overwritten may construct a linked list of portions of data for each of the subsections. The portions of data may have variable sizes. In response to two data write operations to a particular subsection at a particular value of the indicator, data being written for each of the two data write operations may be combined if data for the second data write operation is a subset of data for the first data write operation. The computer software may also include executable code that restores the storage device to a state thereof at a particular point in time by writing the maintained data to the storage device. Executable code that writes the maintained data to the storage device may construct subsections of the data by combining separate portions thereof corresponding to the same subsection. The computer software may also include executable code that inserts data for a particular subsection at a particular point in time by traversing data corresponding to the particular subsection to obtain an appropriate insertion point. The computer software may also include executable code that reads data for a particular subsection at a particular point in time by traversing data corresponding to the particular subsection and reading from the group consisting of: data from the storage device, maintained data, and a combination of maintained data and data from the storage device. The computer software may also include executable code that compresses data by combining consecutive portions for a subsection.

According further to the present invention, providing continuous backup of a storage device includes subdividing the storage device into subsections, providing a mirror device of the storage device that contains a copy of data that is on the storage device when the continuous backup is initiated, providing a time indicator that is modified periodically, and, in response to a request to write new data to a particular subsection of the storage device at a particular time, maintaining data being overwritten by the new data according to the particular subsection and according to a value of the indicator at the particular time, where, for a first write after the continuous backup is initiated, data from the mirror device is used to maintain data being overwritten. The subsections may be tracks. Maintaining the data being overwritten may include constructing a linked list of portions of data for each of the subsections. The portions of data may have variable sizes. In response to two data write operations to a particular subsection at a particular value of the indicator, data being written for each of the two data write operations may be combined if data for the second data write operation is a subset of data for the first data write operation. Providing continuous backup of a storage device may also include restoring the storage device to a state thereof at a particular point in time by writing the maintained data to the storage device. Writing the maintained data to the storage device may include constructing subsections of the data by combining separate portions thereof corresponding to the same subsection. Providing continuous backup of a storage device may also include inserting data for a particular subsection at a particular point in time by traversing data corresponding to the particular subsection to obtain an appropriate insertion point. Providing continuous backup of a storage device may also include reading data for a particular subsection at a particular point in time by traversing data corresponding to the particular subsection and reading from the group consisting of: data from the storage device, maintained data, and a combination of maintained data and data from the storage device. Providing continuous backup of a storage device may also include compressing data by combining consecutive portions for a subsection.
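The role of the mirror device in the first write can be sketched as follows. This is an illustration under stated assumptions, not the implementation described herein: the class `MirroredBackup` and its members are hypothetical, "first write" is interpreted as the first write to each track after the backup is initiated, and the mirror is modeled as an immutable copy of each track taken at initiation.

```python
class MirroredBackup:
    def __init__(self, tracks):
        self.tracks = [bytearray(t) for t in tracks]
        # Mirror device: copy of the data present when backup is initiated.
        self.mirror = [bytes(t) for t in tracks]
        # Assumption: "first write" is tracked separately for each track.
        self.written = [False] * len(tracks)
        # Per-track history of (indicator value, offset, overwritten bytes).
        self.history = [[] for _ in tracks]
        self.indicator = 0

    def write(self, track: int, offset: int, new_data: bytes) -> str:
        """Apply a write; return which device supplied the saved data."""
        end = offset + len(new_data)
        if not self.written[track]:
            # First write: the overwritten data is taken from the mirror.
            source, name = self.mirror[track], "mirror"
            self.written[track] = True
        else:
            # Later writes: the overwritten data is read from the device.
            source, name = self.tracks[track], "storage"
        saved = bytes(source[offset:end])
        self.history[track].append((self.indicator, offset, saved))
        self.tracks[track][offset:end] = new_data
        return name
```

In this sketch the mirror and the storage device hold identical data for a track until that track is first written, so the saved bytes are the same whichever device supplies them; only the device that is read differs.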

According further to the present invention, computer software, in a storage medium, that provides continuous backup of a storage device, includes executable code that obtains a value of a time indicator that is modified periodically and executable code that, in response to a request to write new data to a particular subsection of the storage device at a particular time, maintains data being overwritten by the new data according to the particular subsection and according to a value of the indicator at the particular time where, for a first write after the continuous backup is initiated, data used to maintain data being overwritten is from a mirror device of the storage device, the mirror device containing a copy of data that is on the storage device when the continuous backup is initiated. The subsections may be tracks. Executable code that maintains the data being overwritten may construct a linked list of portions of data for each of the subsections. The portions of data may have variable sizes. In response to two data write operations to a particular subsection at a particular value of the indicator, data being written for each of the two data write operations may be combined if data for the second data write operation is a subset of data for the first data write operation. The computer software may also include executable code that restores the storage device to a state thereof at a particular point in time by writing the maintained data to the storage device. Executable code that writes the maintained data to the storage device may construct subsections of the data by combining separate portions thereof corresponding to the same subsection. The computer software may also include executable code that inserts data for a particular subsection at a particular point in time by traversing data corresponding to the particular subsection to obtain an appropriate insertion point. The computer software may also include executable code that reads data for a particular subsection at a particular point in time by traversing data corresponding to the particular subsection and reading from the group consisting of: data from the storage device, maintained data, and a combination of maintained data and data from the storage device. The computer software may also include executable code that compresses data by combining consecutive portions for a subsection.

According further to the present invention, providing continuous backup from a local storage device to a remote storage device includes subdividing the local storage device into subsections, providing a time indicator that is modified periodically, and, in response to a request to write new data to a particular subsection of the local storage device at a particular time, maintaining at the remote storage device data being overwritten by the new data according to the particular subsection and according to a value of the indicator at the particular time. The subsections may be tracks. Maintaining the data being overwritten may include constructing a linked list of portions of data for each of the subsections. The portions of data may have variable sizes. In response to two data write operations to a particular subsection at a particular value of the indicator, data being written for each of the two data write operations may be combined if data for the second data write operation is a subset of data for the first data write operation. Providing continuous backup from a local storage device to a remote storage device may also include restoring the local storage device to a state thereof at a particular point in time by writing the maintained data to the remote storage device and transferring the data from the remote storage device to the local storage device. Writing the maintained data to the storage device may include constructing subsections of the data by combining separate portions thereof corresponding to the same subsection. Providing continuous backup from a local storage device to a remote storage device may also include inserting data for a particular subsection at a particular point in time by traversing data corresponding to the particular subsection to obtain an appropriate insertion point. Providing continuous backup from a local storage device to a remote storage device may also include providing a virtual storage device at the local storage device, where the virtual storage device provides access to data maintained at the remote storage device, and reading data for a particular subsection at a particular point in time by traversing data at the remote storage device corresponding to the particular subsection and reading from the local storage device, maintained data, or a combination of maintained data and data from the local storage device, where the maintained data is accessed through the virtual storage device. Providing continuous backup from a local storage device to a remote storage device may also include compressing data by combining consecutive portions for a subsection.

According further to the present invention, computer software, in a storage medium, that provides continuous backup from a local storage device to a remote storage device, includes executable code that obtains a value of a time indicator that is modified periodically and executable code that, in response to a request to write new data to a particular subsection of the local storage device at a particular time, maintains at the remote storage device data being overwritten by the new data according to the particular subsection and according to a value of the indicator at the particular time. The subsections may be tracks. Executable code that maintains the data being overwritten may construct a linked list of portions of data for each of the subsections. The portions of data may have variable sizes. In response to two data write operations to a particular subsection at a particular value of the indicator, data being written for each of the two data write operations may be combined if data for the second data write operation is a subset of data for the first data write operation. The computer software may also include executable code that restores the storage device to a state thereof at a particular point in time by writing the maintained data to the remote storage device and transferring the data from the remote storage device to the local storage device. Executable code that writes the maintained data to the storage device may construct subsections of the data by combining separate portions thereof corresponding to the same subsection. The computer software may also include executable code that inserts data for a particular subsection at a particular point in time by traversing data corresponding to the particular subsection to obtain an appropriate insertion point. The computer software may also include executable code that reads data for a particular subsection at a particular point in time by traversing data at the remote storage device corresponding to the particular subsection and reading from the local storage device, maintained data, or a combination of maintained data and data from the local storage device, where the maintained data is accessed through a virtual storage device provided at the local storage device to provide access to data maintained at the remote storage device. The computer software may also include executable code that compresses data by combining consecutive portions for a subsection.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a schematic diagram showing a plurality of hosts and a data storage device used in connection with the system described herein.

FIG. 2 is a schematic diagram showing a storage device, memory, a plurality of directors, and a communication module according to the system described herein.

FIG. 3 is a diagram of a storage device that shows various logical volumes that are used in connection with the system described herein.

FIG. 4 is a diagram showing use of a virtual device according to the system described herein.

FIG. 5 is a diagram showing use of a plurality of virtual devices according to the system described herein.

FIG. 6 is a diagram showing device tables used in connection with the system described herein.

FIG. 7 is a flow chart illustrating reading a table used in connection with a virtual device according to the system described herein.

FIG. 8 is a flow chart illustrating writing to a table used in connection with a virtual device according to the system described herein.

FIG. 9 is a flow chart illustrating modification of a virtual device table and establishing a virtual device according to the system described herein.

FIG. 10 is a flow chart illustrating modification of data structures used to handle tracks of a log device according to the system described herein.

FIG. 11 is a flow chart illustrating steps performed in connection with reading a virtual device according to the system described herein.

FIG. 12 is a flow chart illustrating steps performed by a disk adapter in connection with writing to a standard logical device to which a virtual device has been established according to the system described herein.

FIG. 13 is a flow chart illustrating steps performed by a host adapter in connection with writing to a standard logical device to which a virtual device has been established according to the system described herein.

FIG. 14 is a flow chart illustrating steps performed in connection with writing to a virtual device according to the system described herein.

FIG. 15 is a flow chart illustrating steps performed in connection with removing a virtual device according to the system described herein.

FIG. 16 is a diagram illustrating a continuous backup virtual device according to the system described herein.

FIG. 17 is a diagram illustrating a data structure used in connection with a continuous backup virtual device according to the system described herein.

FIG. 18 is a diagram illustrating linked lists used in connection with a continuous backup virtual device according to the system described herein.

FIG. 19 is a flow chart illustrating handling a data write operation according to the system described herein.

FIG. 20 is a flow chart illustrating handling a data read operation according to the system described herein.

FIG. 21 is a flow chart illustrating reading data from an earlier point in time according to the system described herein.

FIG. 22 is a flow chart illustrating reverting a storage device to a state from an earlier point in time according to the system described herein.

FIG. 23 is a flow chart illustrating writing data to a storage device at a state from an earlier point in time according to the system described herein.

FIG. 24 is a flow chart illustrating steps performed in connection with reading or writing data from or to the standard logical device during a restoration process according to an embodiment of the system described herein.

FIG. 25 is a diagram illustrating synchronizing multiple storage devices for continuous data backup according to the system described herein.

FIG. 26 is a schematic diagram showing a host, a local storage device, and a remote data storage device used in connection with the system described herein.

FIG. 27 is a diagram illustrating a continuous backup virtual device for backing up on a remote storage device according to the system described herein.

FIG. 28 is a schematic diagram showing a flow of data between a host, a local storage device, and a remote data storage device used in connection with the system described herein.

FIG. 29 is a schematic diagram illustrating items for constructing and manipulating chunks of data on a local storage device according to the system described herein.

FIG. 30 is a diagram illustrating a data structure for a slot on a local storage device used in connection with the system described herein.

FIG. 31 is a flow chart illustrating processing performed in response to a write by a host to a local storage device according to the system described herein.

FIG. 32 is a flow chart illustrating transferring data from a local storage device to a remote storage device according to the system described herein.

FIG. 33 is a flow chart illustrating steps performed in connection with a local storage device incrementing a sequence number according to the system described herein.

FIG. 34 is a schematic diagram illustrating items for constructing and manipulating chunks of data on a local storage device according to an alternative embodiment of the system described herein.

FIG. 35 is a flow chart illustrating processing performed in response to a write by a host to a local storage device according to an alternative embodiment of the system described herein.

FIG. 36 is a flow chart illustrating transferring data from a local storage device to a remote storage device according to an alternative embodiment of the system described herein.

FIG. 37 is a schematic diagram illustrating a plurality of local and remote storage devices with a host according to the system described herein.

FIG. 38 is a diagram showing a multi-box mode table used in connection with the system described herein.

FIG. 39 is a flow chart illustrating modifying a multi-box mode table according to the system described herein.

FIG. 40 is a flow chart illustrating cycle switching by the host according to the system described herein.

FIG. 41 is a flow chart illustrating steps performed in connection with a local storage device incrementing a sequence number according to the system described herein.

FIG. 42 is a flow chart illustrating transferring data from a local storage device to a remote storage device according to the system described herein.

FIG. 43 is a flow chart illustrating transferring data from a local storage device to a remote storage device according to an alternative embodiment of the system described herein.

FIG. 44 is a flow chart illustrating restoring data to a particular point in time using a local storage device and a remote storage device.

FIG. 45 is a diagram that illustrates a virtual device provided at a local storage device providing access to a CB virtual device at a remote storage device.

FIG. 46 is a diagram illustrating a continuous backup virtual device and a local mirror storage device for backing up on a remote storage device according to the system described herein.

DETAILED DESCRIPTION OF VARIOUS EMBODIMENTS

Referring to FIG. 1, a diagram 20 shows a plurality of hosts 22 a-22 c coupled to a data storage device 24. The data storage device 24 includes an internal memory 26 that facilitates operation of the storage device 24 as described elsewhere herein. The data storage device also includes a plurality of host adaptors (HA's) 28 a-28 c that handle reading and writing of data between the hosts 22 a-22 c and the storage device 24. Although the diagram 20 shows each of the hosts 22 a-22 c coupled to each of the HA's 28 a-28 c, it will be appreciated by one of ordinary skill in the art that one or more of the HA's 28 a-28 c may be coupled to other hosts.

The storage device 24 may include one or more RDF adapter units (RA's) 32 a-32 c. The RA's 32 a-32 c are coupled to an RDF link 34 and are similar to the HA's 28 a-28 c, but are used to transfer data between the storage device 24 and other storage devices (not shown) that are also coupled to the RDF link 34. The storage device 24 may be coupled to additional RDF links (not shown) in addition to the RDF link 34.

The storage device 24 may also include one or more disks 36 a-36 c, each containing a different portion of data stored on the storage device 24. Each of the disks 36 a-36 c may be coupled to a corresponding one of a plurality of disk adapter units (DA) 38 a-38 c that provides data to a corresponding one of the disks 36 a-36 c and receives data from a corresponding one of the disks 36 a-36 c. Note that, in some embodiments, it is possible for more than one disk to be serviced by a DA and that it is possible for more than one DA to service a disk.

The logical storage space in the storage device 24 that corresponds to the disks 36 a-36 c may be subdivided into a plurality of volumes or logical devices. The logical devices may or may not correspond to the physical storage space of the disks 36 a-36 c. Thus, for example, the disk 36 a may contain a plurality of logical devices or, alternatively, a single logical device could span both of the disks 36 a, 36 b. The hosts 22 a-22 c may be configured to access any combination of logical devices independent of the location of the logical devices on the disks 36 a-36 c.
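One way to picture a logical device that spans more than one disk is as an ordered list of extents, each naming a disk and a range of physical blocks. The sketch below is purely illustrative: the `Extent` type, the `locate` function, the block-based addressing, and the disk labels are assumptions made for this example and are not part of the system described herein.

```python
from typing import NamedTuple, Sequence, Tuple


class Extent(NamedTuple):
    disk: str     # label of the physical disk (hypothetical)
    start: int    # first physical block of the extent on that disk
    length: int   # number of blocks in the extent


def locate(extents: Sequence[Extent], block: int) -> Tuple[str, int]:
    """Translate a logical block number to (disk, physical block)."""
    if block < 0:
        raise IndexError("negative block number")
    for extent in extents:
        if block < extent.length:
            return extent.disk, extent.start + block
        # Skip past this extent and keep looking in the next one.
        block -= extent.length
    raise IndexError("block beyond end of logical device")


# A single logical device spanning portions of two disks.
spanning_device = [Extent("36a", 100, 50), Extent("36b", 0, 50)]
```

With this layout, a host addresses the logical device as one contiguous range of 100 blocks and never needs to know that block 60 happens to reside on the second disk.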

One or more internal logical data path(s) exist between the DA's 38 a-38 c, the HA's 28 a-28 c, the RA's 32 a-32 c, and the memory 26. In some embodiments, one or more internal busses and/or communication modules may be used. In some embodiments, the memory 26 may be used to facilitate data transferred between the DA's 38 a-38 c, the HA's 28 a-28 c and the RA's 32 a-32 c. The memory 26 may contain tasks that are to be performed by one or more of the DA's 38 a-38 c, the HA's 28 a-28 c and the RA's 32 a-32 c, and a cache for data fetched from one or more of the disks 36 a-36 c. Use of the memory 26 is described in more detail hereinafter.

The storage device 24 may be provided as a stand-alone device coupled to the hosts 22 a-22 c as shown in FIG. 1 or, alternatively, the storage device 24 may be part of a storage area network (SAN) that includes a plurality of other storage devices as well as routers, network connections, etc. The storage device may be coupled to a SAN fabric and/or be part of a SAN fabric. The system described herein may be implemented using software, hardware, and/or a combination of software and hardware where software may be stored in an appropriate storage medium and executed by one or more processors.

Referring to FIG. 2, a diagram 50 illustrates an embodiment of the storage device 24 where each of a plurality of directors 52 a-52 c is coupled to the memory 26. Each of the directors 52 a-52 c represents one of the HA's 28 a-28 c, RA's 32 a-32 c, or DA's 38 a-38 c. In an embodiment disclosed herein, there may be up to sixty-four directors coupled to the memory 26. Of course, for other embodiments, there may be a higher or lower maximum number of directors that may be used.

The diagram 50 also shows an optional communication module (CM) 54 that provides an alternative communication path between the directors 52 a-52 c. Each of the directors 52 a-52 c may be coupled to the CM 54 so that any one of the directors 52 a-52 c may send a message and/or data to any other one of the directors 52 a-52 c without needing to go through the memory 26. The CM 54 may be implemented using conventional MUX/router technology where a sending one of the directors 52 a-52 c provides an appropriate address to cause a message and/or data to be received by an intended receiving one of the directors 52 a-52 c. Some or all of the functionality of the CM 54 may be implemented using one or more of the directors 52 a-52 c so that, for example, the directors 52 a-52 c may be interconnected directly with the interconnection functionality being provided on each of the directors 52 a-52 c. In addition, a sending one of the directors 52 a-52 c may be able to broadcast a message to all of the other directors 52 a-52 c at the same time.

Referring to FIG. 3, the storage device 24 is shown as including a plurality of standard logical devices 61-68. Each of the standard logical devices 61-68 may correspond to a volume that is accessible to one or more hosts coupled to the storage device 24. Each of the standard logical devices 61-68 may or may not correspond to one of the disk drives 36 a-36 c. Thus, for example, the standard logical device 61 may correspond to the disk drive 36 a, may correspond to a portion of the disk drive 36 a, or may correspond to a portion of the disk drive 36 a and a portion of the disk drive 36 b.

Each of the standard logical devices 61-68 appears to the host as a contiguous block of disk storage, even though each of the standard logical devices 61-68 may or may not correspond to actual contiguous physical storage of the disk drives 36 a-36 c.

The storage device 24 may also include a plurality of virtual devices 71-74. The virtual devices 71-74 appear to a host coupled to the storage device 24 as volumes containing a contiguous block of data storage. Each of the virtual devices 71-74 may represent a point in time copy of an entire one of the standard logical devices 61-68, a portion of one of the standard logical devices 61-68, or a combination of portions or entire ones of the standard logical devices 61-68. However, as described in more detail elsewhere herein, the virtual devices 71-74 do not contain the track data from the standard logical devices 61-68. Instead, each of the virtual devices 71-74 is coupled to a log device 76 or a log device 78 that stores some or all of the track data, as described in more detail elsewhere herein. The virtual devices 71-74 contain tables that point to tracks of data on either the standard logical devices 61-68 or the log devices 76, 78. In some instances, a single virtual device may store data on more than one log device.

The virtual device 71 may represent a point in time copy of the standard logical device 61. As described in more detail elsewhere herein, the virtual device 71 is coupled to the log device 76 that contains track data to facilitate the virtual device 71 appearing to a host to be a point in time copy of the standard logical device 61. It is possible for more than one virtual device to use a single log device. Thus, the virtual devices 72-74 are shown being coupled to the log device 78. Similarly, it is possible for more than one virtual device to represent point in time copies of a single standard logical device. Thus, the virtual devices 72, 73 are shown as being point in time copies of the standard logical device 64. The virtual devices 72, 73 may represent the same point in time copy of the standard logical device 64 or, alternatively, may represent point in time copies of the standard logical device 64 taken at different times. Note that only some of the standard logical devices 61-68 are shown as being associated with a corresponding one of the virtual devices 71-74 while others of the standard logical devices 61-68 are not.

In some embodiments, it may be possible to implement the system described herein using storage areas, instead of storage devices. Thus, for example, the virtual devices 71-74 may be virtual storage areas, the standard logical devices 61-68 may be standard logical areas, and the log devices 76, 78 may be log areas. In some instances, such an implementation may allow for hybrid logical/virtual devices where a single logical device has portions that behave as a standard logical device, portions that behave as a virtual device, and/or portions that behave as a log device. Accordingly, it should be understood that, in appropriate instances, references to devices in the discussion herein may also apply to storage areas that may or may not correspond directly with a storage device.

Referring to FIG. 4, a diagram shows a standard logical device 82, avirtual device 84, and a log device 86. As discussed above, the virtualdevice 84 may represent a point in time copy of all or a portion of thestandard logical device 82. A host coupled to a storage device thataccesses the virtual device 84 may access the virtual device 84 in thesame way that the host would access the standard logical device 82.However, the virtual device 84 does not contain any track data from thestandard logical device 82. Instead, the virtual device 84 includes aplurality of table entries that point to tracks on either the standardlogical device 82 or the log device 86.

When the virtual device 84 is established (e.g., when a point in time copy is made of the standard logical device 82), the virtual device 84 is created and provided with appropriate table entries that, at the time of establishment, point to tracks of the standard logical device 82. A host accessing the virtual device 84 to read a track would read the appropriate track from the standard logical device 82 based on the table entry of the virtual device 84 pointing to the track of the standard logical device 82.

After the virtual device 84 has been established, it is possible for ahost to write data to the standard logical device 82. In that case, theprevious data that was stored on the standard logical device 82 iscopied to the log device 86 and the table entries of the virtual device84 that previously pointed to tracks of the standard logical device 82would be modified to point to the new tracks of the log device 86 towhich the data had been copied. Thus, a host accessing the virtualdevice 84 would read either tracks from the standard logical device 82that have not changed since the virtual device 84 was established or,alternatively, would read corresponding tracks from the log device 86that contain data copied from the standard logical device 82 after thevirtual device 84 was established. Adjusting data and pointers inconnection with reads and writes to and from the standard logical device82 and virtual device 84 is discussed in more detail elsewhere herein.
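The copy-on-first-write behavior described above can be illustrated with a minimal sketch. The class and function names (VirtualDevice, host_write), the use of Python lists as devices, and the ("std", n) / ("log", n) table entries are assumptions made for illustration only and do not appear in the embodiment described herein.

```python
class VirtualDevice:
    """Toy point-in-time copy: holds only a table of pointers, no track data."""

    def __init__(self, standard, log):
        self.standard = standard  # track payloads of the standard logical device
        self.log = log            # track payloads of the log device
        # At establishment, every table entry points to the standard logical device.
        self.table = [("std", t) for t in range(len(standard))]

    def read(self, track):
        where, idx = self.table[track]
        return self.standard[idx] if where == "std" else self.log[idx]


def host_write(standard, log, vdev, track, new_data):
    """Write to the standard logical device, preserving the old data for vdev."""
    if vdev.table[track][0] == "std":
        log.append(standard[track])                # copy old data to a new log track
        vdev.table[track] = ("log", len(log) - 1)  # repoint the table entry
    standard[track] = new_data


standard = ["a0", "b0", "c0"]
log = []
snap = VirtualDevice(standard, log)
host_write(standard, log, snap, 1, "b1")
host_write(standard, log, snap, 1, "b2")  # second write: nothing further to preserve
```

In this sketch only the first write to a track after establishment consumes a log track; later writes to the same track leave the preserved copy untouched.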

In an embodiment described herein, hosts would not have direct access tothe log device 86. That is, the log device 86 would be used exclusivelyin connection with the virtual device 84 (and possibly other virtualdevices as described in more detail elsewhere herein). In addition, foran embodiment described herein, the standard logical device 82, thevirtual device 84, and the log device 86 may be provided on the singlestorage device 24. However, it is possible to provide the differentlogical devices and the log device on separate storage devicesinterconnected using, for example, the RDF protocol or other remotecommunication protocols. In addition, it may be possible to haveportions of one or more of the standard logical device 82, the virtualdevice 84, and/or the log device 86 provided on separate storage devicesthat are appropriately interconnected.

Referring to FIG. 5, another example of the use of virtual devices showsa standard logical device 92, a plurality of virtual devices 94-97 and alog device 98. In the example of FIG. 5, the virtual device 94represents a point in time copy of the standard logical device 92 takenat ten a.m. Similarly, the virtual device 95 represents a copy of thestandard logical device 92 taken at twelve noon, the virtual device 96represents a copy of the standard logical device 92 taken at two p.m.,and the virtual device 97 represents a copy of the standard logicaldevice 92 taken at four p.m. Note that all of the virtual devices 94-97may share the log device 98. In addition, it is possible for tableentries of more than one of the virtual devices 94-97, or, a subset ofthe table entries of the virtual devices 94-97, to point to the sametracks of the log device 98. For example, the virtual device 95 and thevirtual device 96 are shown as having table entries that point to thesame tracks of the log device 98.

In an embodiment discussed herein, the log device 98 and other logdevices discussed herein are provided by a pool of log devices that ismanaged by the storage device 24. In that case, as a virtual devicerequires additional tracks of a log device, the virtual device wouldcause more log device storage to be created (in the form of more tracksfor an existing log device or a new log device) using the log devicepool mechanism. Pooling storage device resources in this manner is knownin the art. Other techniques that do not use pooling may be used toprovide log device storage.

Referring to FIG. 6, a diagram 100 illustrates tables that are used tokeep track of device information. A first table 102 corresponds to allof the devices used by a storage device or by an element of a storagedevice, such as an HA and/or a DA. The table 102 includes a plurality oflogical device entries 106-108 that correspond to all the logicaldevices used by the storage device (or portion of the storage device).The entries in the table 102 include descriptions for standard logicaldevices, virtual devices, log devices, and other types of logicaldevices.

Each of the entries 106-108 of the table 102 corresponds to another table that contains information for each of the logical devices. For example, the entry 107 may correspond to a table 112. The table 112 includes a header that contains overhead information. The table 112 also includes entries 116-118 for each of the cylinders of the logical device. In an embodiment disclosed herein, a logical device may contain any number of cylinders depending upon how the logical device is initialized. However, in other embodiments, a logical device may contain a fixed number of cylinders.

The table 112 is shown as including a section for extra track bytes 119.The extra track bytes 119 are used in connection with the log devices ina manner that is discussed elsewhere herein. In an embodiment disclosedherein, there are eight extra track bytes for each track of a logdevice. For devices that are not log devices, the extra track bytes 119may not be used.

Each of the cylinder entries 116-118 corresponds to a track table. For example, the entry 117 may correspond to a track table 122 that includes a header 124 having overhead information. The track table 122 also includes entries 126-128 for each of the tracks. In an embodiment disclosed herein, there are fifteen tracks for every cylinder. However, for other embodiments, it may be possible to have different numbers of tracks for each of the cylinders or even a variable number of tracks for each cylinder. For standard logical devices and log devices, the information in each of the entries 126-128 includes a pointer (either direct or indirect) to the physical address on one of the disk drives 42-44 of the storage device 24 (or a remote storage device if the system is so configured). Thus, the track table 122 may be used to map logical addresses of the logical device corresponding to the tables 102, 112, 122 to physical addresses on the disk drives 42-44 of the storage device 24. For virtual devices, each of the entries 126-128 of the table 122 points to a track of a corresponding standard logical device or corresponding log device. For other embodiments, however, it may be possible to use a different mechanism where the tables 102, 112, 122 are used only for standard logical devices that contain tracks of data while another type of table, such as a simple array of tracks, is used by virtual devices to map tracks of the virtual devices to tracks of corresponding standard logical devices or log devices.
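The three-level lookup through the tables 102, 112, 122 may be pictured with the following sketch. The dictionary layout, the function names build_device and resolve, and the use of consecutive integers as stand-ins for physical addresses are all hypothetical; only the figure of fifteen tracks per cylinder is taken from the embodiment described above.

```python
TRACKS_PER_CYLINDER = 15  # value given for the embodiment disclosed herein


def build_device(num_cylinders, first_physical=0):
    """Build a hypothetical device table: header plus one track table per cylinder."""
    cylinders = []
    addr = first_physical
    for _ in range(num_cylinders):
        # Each track table has a header and one pointer per track.
        tracks = list(range(addr, addr + TRACKS_PER_CYLINDER))
        cylinders.append({"header": {}, "tracks": tracks})
        addr += TRACKS_PER_CYLINDER
    return {"header": {}, "cylinders": cylinders}


def resolve(devices, device_id, cylinder, track):
    """Follow device table -> cylinder entry -> track entry to a physical address."""
    return devices[device_id]["cylinders"][cylinder]["tracks"][track]


# The key 107 merely echoes the entry 107 of the table 102.
devices = {107: build_device(2, first_physical=1000)}
address = resolve(devices, 107, 1, 3)
```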

Each track of a log device is either free, meaning that it is not being used by a virtual device, or is assigned, meaning that the track is pointed to by a table entry in one or more of the virtual devices. In an embodiment disclosed herein, the tracks of a log device are managed by first creating a doubly linked list of all of the free tracks of the log device. The pointers for the doubly linked list are provided by the extra track bytes 119 of the table 112 so that the extra track bytes 119 for a log device contain eight bytes for every track of the log device. For every track of the log device that is free, the extra eight bytes include a forward pointer pointing to the next free track of the log device and a backward pointer pointing to the previous free track of the log device. Using a doubly linked list in this manner facilitates accessing free tracks of the log device.
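A minimal sketch of such a free list follows. The per-track forward and backward pointers are modeled as two Python lists rather than as eight packed bytes, and the class name LogDeviceFreeList, the NIL sentinel, and the choice to allocate from and release to the head of the list are assumptions for illustration.

```python
NIL = -1  # hypothetical sentinel meaning "no track"


class LogDeviceFreeList:
    """Doubly linked list of free log device tracks, threaded through per-track pointers."""

    def __init__(self, num_tracks):
        # Initially every track is free and linked in track order.
        self.fwd = [i + 1 for i in range(num_tracks)]
        self.bwd = [i - 1 for i in range(num_tracks)]
        if num_tracks:
            self.fwd[-1] = NIL
        self.head = 0 if num_tracks else NIL

    def allocate(self):
        """Unlink and return the first free track."""
        track = self.head
        if track == NIL:
            raise RuntimeError("no free tracks on the log device")
        self.head = self.fwd[track]
        if self.head != NIL:
            self.bwd[self.head] = NIL
        self.fwd[track] = self.bwd[track] = NIL
        return track

    def release(self, track):
        """Return a track that is no longer needed to the front of the list."""
        self.fwd[track] = self.head
        self.bwd[track] = NIL
        if self.head != NIL:
            self.bwd[self.head] = track
        self.head = track

    def free_tracks(self):
        """Walk the forward pointers and list the free tracks in order."""
        out, t = [], self.head
        while t != NIL:
            out.append(t)
            t = self.fwd[t]
        return out


fl = LogDeviceFreeList(4)
a = fl.allocate()
b = fl.allocate()
after_alloc = fl.free_tracks()
fl.release(a)
after_release = fl.free_tracks()
```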

In addition, if a track of a log device is assigned (i.e., is used byone or more virtual devices), the corresponding extra track bytes 119for the track may be used to point back to the corresponding track ofthe standard logical device. Thus, when a write is performed to thestandard logical device after the virtual device has been established,the data from the standard logical device is copied to a new track ofthe log device and the extra track bytes corresponding to the new trackof the log device are made to point back to the track of the standardlogical device from which the data came. Having each track of the logdevice point back to the corresponding track of the standard logicaldevice is useful in, for example, data recovery situations.

In addition, for an embodiment disclosed herein, the pointers for theextra eight bytes per track for an assigned track are stored with thedata also. That is, when a particular track of a log device is assigned,the pointer back to the corresponding track of a standard logical deviceis stored with the extra track bytes 119 and, in addition, the pointeris stored with the track data itself on the track of the log device. ForCKD formatted tracks, the extra eight bytes may be stored in block zero.For FBA formatted tracks, the extra eight bytes may be stored in anadditional block appended on the end of the track. In an embodimentdisclosed herein, a block is five hundred and twelve bytes and an FBAtrack contains forty blocks, which is increased to forty one when anadditional block is appended. Different track formats are disclosed, forexample, in U.S. Pat. No. 5,206,939 to Yanai, et al., which isincorporated herein by reference.

The tables 102, 112, 122 of FIG. 6 may be stored in the global memory 46of the storage device 24. In addition, the tables corresponding todevices accessed by a particular host may be stored in local memory ofthe corresponding one of the HA's 32-36. In addition, the RA 48 and/orthe DA's 36-38 may also use and locally store portions of the tables102, 112, 122.

Referring to FIG. 7, a flow chart 140 illustrates steps performed when ahost reads data from a device table corresponding to a track that isaccessible through a virtual device. That is, the flow chart 140illustrates obtaining information about a track that is pointed to by atable entry for a virtual device.

Processing begins at a test step 142 where it is determined if the trackof interest (i.e., the track corresponding to the table entry beingread) is on the standard logical device or the log device. This isdetermined by accessing the device table entry for the virtual deviceand determining whether the table entry for the track of interest pointsto either the standard logical device or the log device. If it isdetermined at the test step 142 that the pointer in the table for thevirtual device points to the standard logical device, then controlpasses from the step 142 to a step 148 where the table entry of interestis read. Following the step 148, processing is complete.

If it is determined at the test step 142 that the pointer in the device table for the virtual device for the track of interest points to the log device, then control transfers from the step 142 to a step 158 where the log table entry of interest is read. Following the step 158, processing is complete.

Note that, in some instances, access to data may be controlled by a flagor lock that prohibits multiple processes having access to the datasimultaneously. This is especially useful in instances where a devicetable is being read or modified. The system disclosed hereincontemplates any one of a variety of mechanisms for controlling accessto data by multiple processes, including conventional combinations ofsoftware and/or hardware locks, also known as “flags” or “semaphores”.In some instances, a process accessing data may need to wait untilanother process releases the data. In one embodiment, a hardware lockcontrols access to a software lock (flag) so that a process firstobtains control of the hardware lock, tests the software lock, and then,if the software lock is clear, the process sets the software lock andthen releases the hardware lock. If the process gets the hardware lockand determines that the software lock is not clear, then the processreleases the hardware lock so that another process that has set thesoftware lock can clear the software lock at a later time. Further notethat, in some instances, it is useful to first read a table entrycorresponding to a particular track, read the track into a cache slot(if the track is not already in cache), lock the cache slot, and thenreread the corresponding table entry.
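The two-level locking embodiment described above may be sketched as follows. Here a threading.Lock stands in for the hardware lock and a Boolean for the software lock (flag); the class name TwoLevelLock and the choice to take the hardware lock while clearing the flag are assumptions for illustration.

```python
import threading


class TwoLevelLock:
    """A short-held 'hardware' lock guarding a longer-held software flag."""

    def __init__(self):
        self._hw = threading.Lock()  # stand-in for the hardware lock
        self._flag = False           # the software lock (flag)

    def try_acquire(self):
        """Return True if the software lock was clear and is now set by the caller."""
        with self._hw:               # obtain control of the hardware lock
            if self._flag:           # software lock not clear: give up for now
                return False
            self._flag = True        # set the software lock
            return True              # hardware lock is released on exit

    def release(self):
        """Clear the software lock at a later time."""
        with self._hw:
            self._flag = False


lock = TwoLevelLock()
first = lock.try_acquire()
second = lock.try_acquire()  # flag already set, so this attempt must wait and retry
lock.release()
third = lock.try_acquire()
```

The hardware lock is held only long enough to test and set the flag, so a process that finds the flag set releases the hardware lock immediately and the holder of the flag is able to clear it.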

Referring to FIG. 8, a flow chart 170 illustrates steps performed inconnection with writing information to a device table for a virtualdevice corresponding to a standard logical device or a log device.Processing begins at a first step 172 where it is determined if theparticular track corresponding to the device table entry being writtenis on the standard logical device or the log device. If it is determinedthe particular track of interest is on the standard logical device,control passes from the step 172 to a step 178 where the trackcorresponding to the device table entry being written is locked. Lockingthe track at the step 178 prevents other processes from getting accessto the track, and from modifying the corresponding table entry, whilethe current process is modifying the device table entry corresponding tothe track. Following the step 178 is a step 182 where the writeoperation is performed. Following the step 182 is a step 184 where thetrack is unlocked. Following the step 184, processing is complete.

If it is determined at the test step 172 that the track corresponding to the table entry for the virtual device that is being modified points to the log device, then control passes from the test step 172 to a step 194 where the track of the log device corresponding to the entry of the device table that is being written is locked. Following the step 194 is a step 196 where the write operation is performed. Following the step 196 is a step 198 where the track is unlocked. Following the step 198, processing is complete.

Referring to FIG. 9, a flow chart 210 illustrates steps performed inconnection with modifying a device table corresponding to a virtualdevice. This may be contrasted with the flow chart 170 of FIG. 8 thatillustrates modifying the device table for the standard logical deviceor the log device pointed to by an entry for a track of the device tablefor a virtual device. In flow chart 210, the device table for thevirtual device is modified, as opposed to the device table for thestandard logical device or the device table for the log device.

Processing begins at a first step 212 where it is determined if themodifications to the table relate to establishing the virtual device. Asdiscussed elsewhere herein, establishing a virtual device includesmaking the virtual device available for access by a host after thevirtual device is created. Establishing a virtual device causes thevirtual device to be associated with a standard logical device (andthus, represent a point in time copy of the standard logical device atthe time of establishment). Prior to being associated with a standardlogical device, a virtual device is not established and is notaccessible by a host. After being established, a virtual device isaccessible by a host.

If it is determined at the step 212 that the modifications to the tablerelate to establishing the virtual device, then control passes from thestep 212 to a step 214 where a device lock for the virtual device is setto prohibit access to the table by other processes. The device lock iscomparable to the cache slot lock, discussed elsewhere herein.

Following the step 214 is a step 216 where the pointers of the virtualdevice table are made to point to tracks of the standard logical deviceand where a protection bit is set for each of the tracks of the standardlogical device that corresponds to the virtual device being established.In an embodiment disclosed herein, each of the tracks of the standardlogical device has sixteen bits which may be set as protection bits, onefor each virtual device established to the standard logical device. Insome embodiments, the protection bits may have uses that are unrelatedto virtual devices. A new virtual device being established may beassigned a new bit position in the sixteen bit field while the bit foreach track of the standard logical device may be set. As discussed inmore detail elsewhere herein, the protection bit being set followed by asubsequent write to the standard logical device indicates that specialprocessing needs to take place to accommodate the virtual deviceestablished to the standard logical device. The special processing isdescribed in more detail elsewhere herein. Also at the step 216, thetrack entries for the device table for the virtual device are allmodified to point to the corresponding tracks of the standard logicaldevice. Thus, when the virtual device is first established, all of thepointers of the device table of the virtual device point to the tracksof the standard logical device.

Following the step 216 is a step 217 where the virtual device is set to the ready state, thus making the virtual device accessible to hosts. Following the step 217 is a step 218 where the virtual device is unlocked, thus allowing access by other processes. Following the step 218, processing is complete.
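The assignment of protection bits at the step 216 may be pictured with the short sketch below. The function name establish, the representation of each track's sixteen-bit field as a Python integer, and the policy of choosing the lowest unused bit position are assumptions for illustration.

```python
PROTECTION_BITS = 16  # sixteen bits per track in the embodiment disclosed herein


def establish(track_bits, used_positions):
    """Assign an unused bit position to a new virtual device and set it on every track."""
    for pos in range(PROTECTION_BITS):
        if pos not in used_positions:
            break
    else:
        raise RuntimeError("all sixteen bit positions are in use")
    used_positions.add(pos)
    mask = 1 << pos
    for i in range(len(track_bits)):
        track_bits[i] |= mask  # protect this track on behalf of the new virtual device
    return pos


bits = [0, 0, 0]  # one protection field per track of the standard logical device
used = set()
p0 = establish(bits, used)
p1 = establish(bits, used)
bits_after_establish = list(bits)
bits[1] &= ~(1 << p0)  # e.g. special processing for track 1 completed for device p0
```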

If it is determined at the test step 212 that the virtual device is not being established (i.e., some other operation is being performed), then control passes from the test step 212 to a step 222 to lock a track corresponding to the entry of the device table for the virtual device that is being modified. Note that the track that is locked at the step 222 may either be a track on the standard logical device (if the entry of interest in the device table of the virtual device points to the standard logical device) or a track of the log device (if the entry of interest points to the log device). Following the step 222 is a step 224 where the modification to the device table for the virtual device is performed. Following the step 224 is a step 226 where the track is unlocked. Following the step 226, processing is complete.

Referring to FIG. 10, a flow chart 230 illustrates steps performed inconnection with manipulating tracks of a log device. As discussed above,the tracks of a log device are maintained by creating a doubly linkedlist of tracks of the log device that are free (i.e. tracks that areavailable for accepting new data). Thus, if one or more tracks areneeded for use in connection with a corresponding virtual device, thefree tracks are obtained from the doubly linked list, which is modifiedin a conventional manner to indicate that the tracks provided for use bythe virtual device are no longer free. Conversely, if one or more tracksthat are used by one or more virtual devices are no longer needed, thetracks are returned to the doubly linked list, in a conventional manner,in order to indicate that the tracks are free. The flow chart 230 ofFIG. 10 illustrates the steps performed in connection with controllingaccess to the tracks (and track pointers) by multiple processes whichmanipulate the tracks.

Processing begins at a test step 232 where it is determined if the operation being performed is modifying only tracks that are on the free list. Note that modifying tracks only on the free list by, for example, transferring a free track from one part of the list to another part or from one free list to another free list (in the case of multiple free lists), does not involve modifications for tracks corresponding to any data. If it is determined at the test step 232 that the modification being performed does not involve only tracks on the free list, then control transfers from the step 232 to a step 234 where the track is locked to prevent access by other processes.

Following the step 234, or following the step 232 if the step 234 is not reached, is a test step 236 where it is determined if the manipulation involves only allocated tracks. For any operation involving only allocated tracks, it is not necessary to lock the log device list of free tracks. If it is determined at the step 236 that the operation being performed is not manipulating only allocated tracks, then control transfers from the step 236 to the step 238 where the log device list of free tracks is locked to prevent access by other processes.

Following the step 238, or following the step 236 if the step 238 is not executed, is a step 242 where the modification is performed. Following the step 242 is a test step 244 where it is determined if the manipulation involves only allocated tracks. If it is determined at the test step 244 that the modification being performed does not involve only allocated tracks, then control transfers from the step 244 to a step 246 where the log device free list is unlocked. Following the step 246, or following the step 244 if the step 246 is not reached, is a test step 248 where it is determined if the operation being performed is modifying only tracks that are on the free list. If it is determined at the step 248 that the operation being performed is not modifying only tracks that are on the free list, then control transfers from the step 248 to the step 252 where the track or tracks locked at the step 234 are unlocked. Following the step 252, or following the step 248 if the step 252 is not executed, processing is complete.
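The lock ordering of the flow chart 230 may be summarized by the sketch below, which records the actions taken rather than performing real locking. It assumes that the unlock at the step 252 mirrors the lock at the step 234, i.e. that tracks are unlocked exactly when they were locked; the function name and the action strings are hypothetical.

```python
def manipulate_log_tracks(only_free_list, only_allocated, modify):
    """Trace the lock/unlock sequence of FIG. 10 for one manipulation."""
    actions = []
    if not only_free_list:                   # test step 232
        actions.append("lock tracks")        # step 234
    if not only_allocated:                   # test step 236
        actions.append("lock free list")     # step 238
    modify()                                 # step 242
    actions.append("modify")
    if not only_allocated:                   # test step 244
        actions.append("unlock free list")   # step 246
    if not only_free_list:                   # test step 248
        actions.append("unlock tracks")      # step 252
    return actions


free_only = manipulate_log_tracks(True, False, lambda: None)
allocated_only = manipulate_log_tracks(False, True, lambda: None)
both = manipulate_log_tracks(False, False, lambda: None)
```

Under this reading the locks are released in the reverse of the order in which they were taken.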

Referring to FIG. 11, a flow chart 280 illustrates steps performed in connection with reading data from a virtual device. Processing begins at a test step 282, where it is determined if the device table entry for the track of interest of the virtual device points to the standard logical device or points to the log device. If it is determined at the test step 282 that the table points to the standard logical device, then control passes from the step 282 to a step 284, where the track is read from the standard logical device. Following the step 284, processing is complete. Alternatively, if it is determined at the test step 282 that the device table of the virtual device points to the log device, then control passes from the step 282 to a step 286, where the track of interest is read from the log device. Following the step 286, processing is complete.

Note that in some instances, it may be possible that prior to the teststep 282, it is determined that the track of interest being read isalready in the cache memory (global memory). In that case, the track maybe obtained from the cache memory without executing any of the steps282, 284, 286.

Referring to FIG. 12, a flow chart 300 illustrates steps performed by a DA in connection with writing to a track of a standard logical device to which a virtual device has been previously established. Processing begins at a first test step 302 where it is determined if any protection bits for the track being written on the standard logical device have been set. If it is determined at the test step 302 that the protection bits are not set, then control transfers from the step 302 to a step 304, where a normal write operation is performed. That is, at the step 304, data is written to the standard logical device in a conventional fashion without regard to the existence of a virtual device that had been previously established to the standard logical device. Following the step 304, processing is complete.

If it is determined at the test step 302 that one or more protectionbits have been set on the track of the standard logical device that isbeing written, control passes from the step 302 to a step 306, where afree track of the log device is obtained. The free track of the logdevice is needed to copy data from the track of the standard logicaldevice. Also, as described in more detail elsewhere herein, free tracksof the log device may be managed using a doubly-linked list of the freetracks. Thus, at the step 306, it may be possible to obtain a free trackby traversing the list of free tracks of the log device and modifyingthe pointers appropriately to remove one of the free tracks for use.

Following the step 306 is a step 308, where, for each virtual devicethat corresponds to a protection bit that was determined to be set atthe test step 302, the pointers of the virtual devices, which initiallypointed to the track being written on the standard logical device, aremodified at the step 308 to point to the free track of the log deviceobtained at the step 306. As discussed above, it is possible to havemore than one virtual device established to a standard logical device.For each virtual device that has been established to a particularstandard logical device, a specific protection bit will be set for eachof the tracks of the standard logical device. Thus, at the step 308, thetrack pointers are changed for all the virtual devices corresponding toa set protection bit detected at the step 302. The track pointers in thedevice tables of virtual devices are modified to point to the new trackthat was obtained at the step 306.

Following the step 308 is a step 312, where the data is caused to becopied from the standard logical device to the new track on the logdevice that was obtained at the step 306. In an embodiment disclosedherein, the data may be copied by moving the data from disk storage tothe global memory of the storage device (e.g., into a cache slot), andthen setting a write pending indicator to cause the data to be copied tothe track of the log device obtained at the step 306. The step 312represents copying the data from the track of the standard logicaldevice that is being written to the new track of the log device obtainedat the step 306. Since all the pointers are modified at the step 308,any virtual device that has been established to the standard logicaldevice prior to the track being written now points to the old data(i.e., the data as it existed on the track of the standard device whenthe virtual devices were established). Note also that, in connectionwith copying the track, the protection bits of the standard logicaldevice track are copied to virtual device map bits for the track on thelog device, which is explained in more detail elsewhere herein.

Following the step 312 is a step 314, where the track of the log deviceobtained at the step 306 is modified so that the extra bytes in thetable (discussed elsewhere herein) are made to point back to the trackof the standard logical device that is being written. Having the trackof the log device point to the corresponding track of the standardlogical device from which the data was provided is useful in manyinstances. For example, it may be useful in connection with datarecovery. Following the step 314 is a step 316, where the protectionbits of the tracks of the standard logical device being written arecleared. Following the step 316 is a step 318, where status is sent tothe HA. Following the step 318, processing is complete.
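The steps 302 through 318 may be condensed into the following sketch. Devices are modeled as Python lists, the free tracks as a simple list, and the virtual device tables as a dictionary keyed by protection bit position; these structures and the name da_protect_track are assumptions for illustration, and the virtual device map bits mentioned in connection with the step 312 are omitted for brevity.

```python
def da_protect_track(std, log, free_tracks, vdev_tables, protection, back_ptr, track):
    """Preserve the old contents of a protected track before it is overwritten."""
    bits = protection[track]
    if bits == 0:                              # test step 302
        return None                            # step 304: a normal write suffices
    new = free_tracks.pop(0)                   # step 306: obtain a free log track
    for pos, table in vdev_tables.items():     # step 308: repoint each protected device
        if bits & (1 << pos):
            table[track] = ("log", new)
    log[new] = std[track]                      # step 312: copy the old data
    back_ptr[new] = track                      # step 314: point back to the source track
    protection[track] = 0                      # step 316: clear the protection bits
    return new                                 # step 318: status returned to the caller


std = ["a", "b"]
log = [None, None]
free = [0, 1]
tables = {0: [("std", 0), ("std", 1)], 1: [("std", 0), ("std", 1)]}
protection = [0b11, 0b11]
back = {}
first = da_protect_track(std, log, free, tables, protection, back, 1)
second = da_protect_track(std, log, free, tables, protection, back, 1)
```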

Note that once the HA receives status, the HA may perform a normal writeoperation and, in that case, at the test step 302, the protection bitswill not be set, since the bits are cleared at the step 316. The HA thatis performing the write operation sees the protection bits that are setat the step 302 and sends a protection request to the appropriate DA.The HA then may disconnect from the DA and wait for status to arrivefrom the DA indicating that a normal write may be performed. While theHA is disconnected and waiting for status from the DA, the DA mayperform the steps disclosed in the flow chart 300. This is described inmore detail below.

Referring to FIG. 13, a flow chart 320 illustrates steps performed by anHA in connection with a write to a standard logical device to which oneor more virtual devices have been established. Processing begins at afirst test step 322, where it is determined if any protection bits areset for the tracks of the standard logical device that are beingwritten. If it is determined at the test step 322 that no protectionbits are set, then control passes from the step 322 to a step 324, wherea normal write is performed. Following the step 324, processing iscomplete.

If it is determined at the test step 322 that one or more protectionbits are set for the tracks of the standard logical device that arebeing written, control passes from the step 322 to a step 326, where theHA sends a request to the DA indicating that protection bits are set forthe tracks. When the DA receives the request that is sent at the step326, the DA performs the operations set forth in the flow chart 300 ofFIG. 12, discussed above. Following the step 326 is a step 328, wherethe HA disconnects from the DA in order to allow (possibly unrelated)operations to be performed with the DA by other processes and/or otherHA's.

Following the step 328 is a step 332, where the HA waits for the DA toperform the operations set forth in the flow chart 300 of FIG. 12 and tosend status to the HA indicating that the appropriate steps have beenperformed to handle the set protection bits. Following the step 332,processing transfers back to the step 322, where the protection bits forthe track of the standard logical device are again tested. Note that ona second iteration, it is expected that the protection bits of the trackof the standard logical device that are being written would be clear atthe step 322, since the DA would have cleared the protection bits inconnection with performing the steps of the flow chart 300. Of course,it is always possible that a new virtual device will be established tothe standard logical device in between the DA clearing the protectionbits and the step 322 being executed again. However, it is usuallyexpected that the second iteration of the step 322 for a particulartrack of the standard logical device will determine that all theprotection bits are clear, and control will transfer from the step 322to the step 324 to perform a normal write.
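The retry loop of the flow chart 320 may be sketched as below. The send, disconnect, and wait of the steps 326, 328, 332 are collapsed into a single blocking callback, request_da, which is a hypothetical stand-in for the DA processing; fake_da is likewise a stub written only for this example.

```python
def ha_write(std, protection, track, data, request_da):
    """Write to a standard logical device track, deferring to the DA while protected."""
    requests = 0
    while protection[track] != 0:  # test step 322
        request_da(track)          # steps 326-332: request, disconnect, wait for status
        requests += 1
    std[track] = data              # step 324: normal write
    return requests


std = ["old"]
protection = [0b1]
saved = []


def fake_da(track):
    """Stub DA: preserve the old data, then clear the protection bits."""
    saved.append(std[track])
    protection[track] = 0


n_first = ha_write(std, protection, 0, "new", fake_da)
n_second = ha_write(std, protection, 0, "newer", fake_da)
```

The loop, rather than a single test, reflects the possibility noted above that a new virtual device is established between the DA clearing the bits and the step 322 being executed again.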

Referring to FIG. 14, a flow chart 340 illustrates steps performed inconnection with writing to a virtual device. The flow chart 340represents steps performed by both the HA and the DA and thus could havebeen provided as two flow charts, similar to the flow chart 300 of FIG.12 and the flow chart 320 of FIG. 13. However, it will be understood bythose of ordinary skill in the art that the flow chart 340 may representa division of steps similar to those set forth in the flow charts 300,320 and described in the corresponding portions of the text of thespecification.

Processing begins at a first step 342, where it is determined if thevirtual device points to the standard logical device. If so, thencontrol transfers from the test step 342 to a step 344, where a freetrack of the log device is obtained. Following the step 344 is a step346, where data from the standard logical device corresponding to thetrack being written is caused to be copied from the standard logicaldevice to the track of the log device obtained at the step 344.Following the step 346 is a step 348, where the virtual device pointerfor the track is adjusted to point to the track obtained at the step344. Following the step 348 is a step 352, where a protection bitcorresponding to the virtual device is cleared in the track data of thestandard logical device, thus indicating that no special processing onbehalf of the virtual device is required when writing to the track ofthe standard device. Following the step 352 is a step 354, where thewrite is executed. At the step 354, the data to be written may be atrack or a portion of a track that is written to the track obtained atthe step 344. Following the step 354, processing is complete. If thedata corresponds to an entire track, then it may be possible toeliminate the step 346, which copies data from the track of the standardlogical device to the new track of the log device, since writing anentire track's worth of data at the step 354 would overwrite all of thedata copied at the step 346.

If it is determined at the test step 342 that the pointer for the track of the virtual device being written does not point to the standard logical device, then control transfers from the step 342 to a test step 356, where it is determined if more than one virtual device has been established to the standard logical device. If not, then control transfers from the step 356 to a step 358, where a normal write operation to the track of the log device is performed. If it is determined at the test step 356 that there is more than one virtual device established to the standard logical device, then control transfers from the step 356 to a step 362, where a free track from the log device is obtained.

Following the step 362 is a step 364, where the data of the track corresponding to the virtual device being written is copied to the track obtained at the step 362. Following the step 364 is a step 366, where the virtual device pointers are adjusted to point to the new track. In one embodiment, the pointer for the virtual device that is being written is made to point to the new track. Alternatively, it is possible to not change the pointer for the virtual device that is being written and, instead, adjust all the pointers for all of the other virtual devices that point to the track at the step 366.

Following the step 366 is a step 368 where the virtual device map bits for the tracks of the log device are modified. For the log device tracks, the virtual device map bits may be used to indicate which virtual devices point to each track, where, in one embodiment, there are sixteen virtual device map bits and each bit corresponds to a particular virtual device. Thus, the test at the step 356 may examine the virtual device map bits for the track.

Following the step 368 is a step 369, where the write is executed. Note that whether the write is executed to the track obtained at the step 362 or to the track that is initially pointed to by the virtual device being written depends upon how the pointers are adjusted at the step 366. In all cases, however, data is written to the track pointed to by the virtual device to which the data is being written. Following the step 369, processing is complete.
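The steps of the flow chart 340 may be sketched as follows. This is only an illustrative model, not the actual implementation: the class `Storage`, its attribute names, and the use of Python lists and sets for tracks, protection bits, and virtual device map bits are all assumptions. For brevity the sketch also assumes every write covers an entire track and that the pointer of the virtual device being written is the one adjusted at the step 366.

```python
STD = "std"  # hypothetical marker: pointer refers to the standard logical device
LOG = "log"  # hypothetical marker: pointer refers to a track of the log device


class Storage:
    def __init__(self, std_tracks, n_log_tracks):
        self.std = list(std_tracks)                  # standard logical device tracks
        self.protect = [set() for _ in self.std]     # protection bits per standard track
        self.log = [None] * n_log_tracks             # log device tracks
        self.free = list(range(n_log_tracks))        # free list of log device tracks
        self.map_bits = [set() for _ in range(n_log_tracks)]  # vdevs using each log track
        self.vdevs = {}                              # vdev id -> per-track pointers

    def establish(self, vdev):
        # A new virtual device initially points at every standard track.
        self.vdevs[vdev] = [(STD, t) for t in range(len(self.std))]
        for bits in self.protect:
            bits.add(vdev)

    def write_virtual(self, vdev, track, data):
        kind, idx = self.vdevs[vdev][track]
        if kind == STD:                              # test step 342
            new = self.free.pop(0)                   # step 344: obtain a free log track
            self.log[new] = self.std[track]          # step 346: copy standard data
            self.vdevs[vdev][track] = (LOG, new)     # step 348: adjust the pointer
            self.map_bits[new].add(vdev)
            self.protect[track].discard(vdev)        # step 352: clear protection bit
            self.log[new] = data                     # step 354: execute the write
        elif len(self.map_bits[idx]) <= 1:           # test step 356 (via map bits)
            self.log[idx] = data                     # step 358: normal write
        else:
            new = self.free.pop(0)                   # step 362: obtain a free log track
            self.log[new] = self.log[idx]            # step 364: copy the shared track
            self.vdevs[vdev][track] = (LOG, new)     # step 366: adjust the pointer
            self.map_bits[idx].discard(vdev)         # step 368: update map bits
            self.map_bits[new].add(vdev)
            self.log[new] = data                     # step 369: execute the write
```

In this sketch, a write through one virtual device never disturbs the standard logical device or a log track that another virtual device still uses.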

Referring to FIG. 15, a flow chart 370 illustrates steps performed in connection with removing (i.e., eliminating) a virtual device. Once a virtual device has been established and used for its intended purpose, it may be desirable to remove the virtual device. Processing begins at a first step 372, where a pointer is set to point to the first track of the virtual device. The virtual device is removed by examining each track corresponding to the virtual device.

Following the step 372 is a step 374, where it is determined if the track of the virtual device that is being examined points to the standard logical device. If so, then control transfers from the step 374 to a step 376 to clear the protection bit on the track of the standard logical device corresponding to the virtual device being removed. Following the step 376 is a step 378, where a pointer points to the next track of the virtual device in order to continue processing by examining the next track. Following the step 378 is a step 382, where it is determined if processing is complete (i.e., all the tracks of the virtual device have been processed). If not, then control transfers from the step 382 back to the test step 374, discussed above.

If it is determined at the test step 374 that the track of the virtual device being examined does not point to the standard logical device, then control transfers from the step 374 to a step 384, where a virtual device map bit on the track of the log device that corresponds to the virtual device being removed is cleared. Each track of the log device may have a set of virtual device map bits indicating which virtual devices use the track of the log device. Thus, at the step 384, the virtual device map bit corresponding to the virtual device being removed is cleared.

Following the step 384 is a test step 386, where it is determined if the bit that was cleared at the step 384 was the last virtual device map bit that was set for the track. In other words, the test step 386 determines if there are other virtual devices that are using the track on the log device. If it is determined at the test step 386 that the last virtual device map bit was cleared at the step 384 (and thus, no other virtual devices use the track), then control transfers from the step 386 to a step 388, where the track of the log device is returned to the free list of tracks of the log device, discussed elsewhere herein. Following the step 388, or following the step 386 if it is determined that the bit cleared at the step 384 is not the last virtual device map bit of the track of the log device, is the step 378, discussed above, where the next track of the virtual device is pointed to for subsequent examination. Once all of the tracks corresponding to the virtual device have been processed, the tables and other data structures associated with the virtual device may also be removed although, in some embodiments, the tables and other data structures from the virtual device may be maintained, so long as the virtual device is not made available for use by hosts after the virtual device is deestablished.
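The removal loop of the flow chart 370 may be illustrated with a short sketch. The function name `remove_virtual_device`, its parameters, and the representation of protection bits and virtual device map bits as Python sets are assumptions made for illustration only.

```python
def remove_virtual_device(vdev, pointers, protect, map_bits, free_list):
    """Walk every track pointer of the virtual device being removed.

    pointers  -- one ("std", track) or ("log", log_track) entry per track
    protect   -- per standard track, the set of vdevs with a protection bit set
    map_bits  -- per log track, the set of vdevs using that log track
    free_list -- free list of log device tracks
    """
    for kind, idx in pointers:               # steps 372, 378, 382: visit each track
        if kind == "std":                    # test step 374
            protect[idx].discard(vdev)       # step 376: clear the protection bit
        else:
            map_bits[idx].discard(vdev)      # step 384: clear the map bit
            if not map_bits[idx]:            # test step 386: was it the last bit?
                free_list.append(idx)        # step 388: return track to free list
```

A log track is returned to the free list only when no other virtual device still uses it, which is what the test step 386 guards.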

In some instances, it may be desirable to provide a mechanism for continuous or near continuous backup of data. Of course, the system described above may provide this functionality by simply creating a new virtual device at each time increment, T, where T is a relatively short amount of time. Similarly, it may be possible to create a new virtual device upon each write of data. However, creating a significant number of new virtual devices would expend a significant amount of overhead and storage space in a way that may be undesirable.

Referring to FIG. 16, a diagram 400 illustrates a continuous backup (CB) virtual device 402 that is like the virtual device discussed above with respect to FIGS. 1-15, but is different in a number of ways (discussed below) that facilitate continuous or near continuous backup of data. The CB virtual device 402 contains pointers to a standard logical device 404 for a plurality of tracks such that, for any particular track, if the CB virtual device 402 points to a corresponding track of the standard logical device 404, then the corresponding track has not changed since creation of the CB virtual device 402. In this respect, the CB virtual device 402 is like the virtual device discussed above with respect to FIGS. 1-15. Note that any subsections, besides tracks, may be used to implement the system described herein. Accordingly, it should be understood in connection with the discussion that follows that although tracks are mentioned, other units of data having another size, including variable sizes, may be used.

The CB virtual device 402 also contains pointers to a log device 406 for a plurality of corresponding tracks. The log device 406 contains data for tracks that have changed since creation of the CB virtual device 402. However, the contents and data structures used in connection with the log device 406 are different from those discussed above in connection with FIGS. 1-15. The log device 406 is discussed in more detail below.

The diagram 400 also shows an I/O module 408 that handles input and output processing to and from other modules, such as input and output requests made by the DA's 38a-38c and HA's 28a-28c shown in FIG. 1. Operation of the I/O module 408 is described in more detail hereinafter.

The I/O module 408 is provided with data from a cycle counter 412 and/or a timer 414. Use of the cycle counter 412 and/or the timer 414 is discussed in more detail hereinafter. Optionally, the cycle counter 412 and/or the timer 414 may be controlled by an external process 416 that may be used to synchronize storage for a plurality of storage devices (i.e., a consistency group). This is also discussed in more detail hereinafter.

Referring to FIG. 17, a data structure 450 that may be used to store data in the log device 406 is illustrated. The data structure 450 includes a device info field 452. The device info field includes device information such as a device identifier, cylinder, head, and length identifiers, and a track ID table. Of course, for different embodiments, different device information may be provided in the device info field 452.

The data structure 450 may also include a timer field 454 and/or a cycle counter field 456. The timer field 454 may correspond to the timer data element 414 discussed above in connection with the diagram 400 of FIG. 16. Similarly, the cycle counter field 456 may correspond to the cycle counter data element 412 of the diagram 400 of FIG. 16. The values provided in the fields 454, 456 are the values of the corresponding data elements 412, 414 at the time each instance of data corresponding to the data structure 450 is created. This is, in effect, a time stamp.

The data structure 450 also includes a data field 462 that corresponds to the particular data being stored on the log device 406 (data being written by a user). In an embodiment herein, the data field 462 may have a variable size so that the amount of data provided with each element varies. In an embodiment herein, the data provided in the data field 462 does not span multiple tracks and thus is no larger than a single track. The data structure 450 further includes a forward pointer field 464 and a backward pointer field 466 for creating a doubly linked list of data elements, as described elsewhere herein.

Referring to FIG. 18, the log device 406 is shown as including a plurality of doubly linked lists 482-484. Each of the linked lists 482-484 contains one or more data elements each having a structure like that illustrated in FIG. 17 and discussed above. In an embodiment herein, each of the linked lists 482-484 corresponds to a particular track of the standard logical device 404 (and thus to a particular track of the CB virtual device 402). Of course, other data structures may be used, such as singly linked lists. In instances where no data has been written to a particular track since creation of the CB virtual device 402, there would be no corresponding linked list stored in the log device 406. Otherwise, the appropriate track entry in the CB virtual device 402 points to the first element of each of the linked lists 482-484. In an embodiment herein, the first element of each of the linked lists 482-484 is the most recently written element, the next element is the next most recently written element, and so on. Of course, any appropriate arrangement of the elements may be used. As mentioned elsewhere herein, each of the elements of each of the linked lists 482-484 may contain a partial track's worth of data. Thus, it is possible that one or more of the linked lists 482-484 does not contain an entire track's worth of data.
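The data structure 450 and the per-track lists 482-484 may be pictured with a minimal sketch. The class name `Element`, the helper `push_front`, the field types, and the contents of the device info dictionary are assumptions; only the set of fields (452, 454, 456, 462, 464, 466) comes from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # identity comparison; avoids recursing through the links
class Element:
    device_info: dict                    # device info field 452 (contents assumed)
    timer: float                         # timer field 454
    cycle: int                           # cycle counter field 456
    data: bytes                          # data field 462, variable size
    fwd: Optional["Element"] = None      # forward pointer field 464 (older element)
    back: Optional["Element"] = None     # backward pointer field 466 (newer element)


def push_front(head: Optional[Element], elem: Element) -> Element:
    """Link elem as the new first (most recent) element and return the new head."""
    elem.fwd = head
    elem.back = None
    if head is not None:
        head.back = elem
    return elem


# Two writes to the same track; the track entry of the CB virtual device
# would point at `head`, the most recently written element.
older = Element({"track": 3, "offset": 0}, timer=10.0, cycle=1, data=b"aaaa")
newer = Element({"track": 3, "offset": 2}, timer=11.0, cycle=2, data=b"bb")
head = push_front(None, older)
head = push_front(head, newer)
```

Keeping the most recent element at the head means that a reader walking the forward pointers sees the data in inverse chronological order.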

Referring to FIG. 19, a flow chart 500 illustrates steps performed in connection with a data write operation according to the system described herein. Processing begins at a first test step 502 where it is determined if the data being written is the first data for a particular track (i.e., no previous writes were performed since beginning the continuous backup). If the data being written is not the first write, then control transfers from the step 502 to a step 504 where it is determined if the current value of the cycle counter 412 equals the value of the cycle counter for the most recent data element of the linked list to which data is being added. In an embodiment herein, the most recent element is pointed to by the CB virtual device 402. Thus, for a write to a particular track, the first element in the corresponding one of the linked lists 482-484 in the log device 406 is examined to see if the cycle counter field 456 contains a value that equals the value stored in the cycle counter data element 412. In an embodiment herein, write operations that occur during the same cycle counter value are deemed to have occurred at the same time. Therefore, the granularity of the continuous backup is the time between updates of the cycle counter. Updating the cycle counter is discussed in more detail hereinafter.

If it is determined at the test step 504 that the current value of the cycle counter equals the value of the cycle counter stored with the most recent data element of the one of the linked lists 482-484 to which data is being written, then control transfers from the step 504 to a test step 506 which determines if the current write being performed contains data that will fit within the data field 462 of the most recent data element (i.e., if data from the second write operation is a subset of data from the first data write operation). As discussed elsewhere herein, the data field 462 is variable length and may or may not be only a portion of a track. Thus, at the test at the step 506 it is determined if the data currently being written could overwrite the data field 462 of the most recent data element. If so, then control transfers from the test step 506 to a step 507 where the data is overwritten. Following the step 507 is a step 508 where a device info field is updated (e.g., the track id table is updated) to reflect the overwrite at the step 507. Following the step 508, processing is complete.

If it is determined at the test step 506 that the data currently being written does not fit within the data field 462 of the most recent data element for the track, or if it is determined at the test step 504 that the current value of the cycle counter does not equal the value in the cycle counter field 456 of the most recent data element, then control transfers to a step 512 where a new data element is allocated. Allocating a new data element at the step 512 involves obtaining enough space for the data structure 450. Note that the size of the data structure 450 may be a function of the amount of data being written in the data field 462. In an embodiment herein, this may be unlike the allocation scheme of FIG. 10, which may assume fixed sizes for data. On the other hand, the scheme of FIG. 10 may be adapted to accommodate the variable data sizes used in connection with the step 512 but then, in some cases, it may be useful to allocate a new track on the log device to store data when the previous allocation for the same track is not large enough.

Following the step 512 is a step 514 where the newly allocated data element is populated by having the data written to the field 462 as well as providing information for the device info field 452, the timer field 454, and the cycle counter field 456. Following the step 514 is a step 516 where the forward pointer field 464 is set to point to the first data element of one of the lists 482-484 (or null if the list is empty) and the backward pointer field 466 is set equal to null. Following the step 516 is a step 518 where the other pointers are adjusted as appropriate (e.g., the backward pointer field 466 of the first data element of the list is set to point to the newly allocated data element). Following the step 518 is a step 522 where the appropriate table from the CB virtual device 402 is set to point to the newly allocated data element. Following the step 522, processing is complete.

If it is determined at the test step 502 that the data being written is the first data written since beginning continuous backup, then control transfers from the step 502 to a step 524 where space for an entire track's worth of data is allocated. In an embodiment herein, the first write to the standard logical device 404 causes an entire track's worth of data to be copied rather than just the amount of data corresponding to the write, as is done, for example, at the steps 512, 514. In other embodiments, a different amount of data may be copied from the standard logical device 404, even on the first write to a particular track. Following the step 524 is a step 526 where the entire track's worth of data is copied. Note that the data that is copied at the step 526 is the data from the standard logical device 404 prior to any modifications thereto. Following the step 526 is the step 512, discussed above.

It is worth noting that the processing illustrated in the flow chart 500 of FIG. 19 may be performed after a write has been accepted and acknowledged to the host. Doing this allows the processing to be performed at a more convenient time (e.g., when the storage device is less busy) and avoids any appreciable response time penalty by allowing for the host write to be immediately acknowledged.
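The write path of the flow chart 500 may be sketched as follows. The sketch is a simplified model under several assumptions: a Python list with the most recent element at index 0 stands in for the doubly linked list, elements are plain dictionaries, the full-track copy made at the steps 524, 526 is kept in a separate `base` attribute, and the names `CBTrack` and `cb_write` are hypothetical. Updating the device info field (step 508) is omitted.

```python
class CBTrack:
    """Per-track log state (illustrative only)."""

    def __init__(self):
        self.elements = []   # most recent first; empty means no writes logged yet
        self.base = None     # full-track copy taken on the first write


def cb_write(track, std_track, offset, data, cycle):
    if track.base is None:                               # test step 502: first write
        track.base = bytes(std_track)                    # steps 524, 526: copy whole track
    elif track.elements:
        head = track.elements[0]                         # most recent data element
        same_cycle = head["cycle"] == cycle              # test step 504
        end = head["offset"] + len(head["data"])
        fits = head["offset"] <= offset and offset + len(data) <= end  # test step 506
        if same_cycle and fits:
            start = offset - head["offset"]
            head["data"][start:start + len(data)] = data  # step 507: overwrite in place
            return
    # Steps 512-522: allocate, populate, and link a new element at the head.
    track.elements.insert(
        0, {"cycle": cycle, "offset": offset, "data": bytearray(data)}
    )


t = CBTrack()
std = b"ABCDEFGH"                       # hypothetical 8-byte track
cb_write(t, std, 2, b"xxxx", cycle=1)   # first write: base copied, element added
cb_write(t, std, 3, b"yy", cycle=1)     # same cycle and fits: overwritten in place
cb_write(t, std, 3, b"zz", cycle=2)     # new cycle: new element
```

Because writes within one cycle counter value may be merged in place, the sketch reflects the statement above that the granularity of the backup is the interval between cycle counter updates.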

Referring to FIG. 20, a flow chart 540 illustrates steps performed in connection with a read operation to read the present data (i.e., to read the data in the present state of the storage device). Processing begins at a first step 542 where it is determined if the CB virtual device 402 points to the standard logical device 404. If so, then control transfers from the test step 542 to a step 544 where the standard logical device is used to retrieve the data being read. Following the step 544, processing is complete.

If it is determined at the test step 542 that the CB virtual device 402 does not point to the standard logical device 404, then control transfers from the test step 542 to a step 546 where a pointer used for iterating through the linked lists of the log device 406 (an iterating pointer) is set to point to the first element of the list being processed (i.e., the list corresponding to the track from which the data is being read). Following the step 546 is a step 548 where data from the element being pointed to by the iterating pointer is used to fill in a variable or data space used to accept the data being read. Note that, to read data, it is necessary to process the data elements in inverse chronological order, giving precedence to more recent data. Since the elements of the linked lists 482-484 do not each necessarily contain an entire track of data, it may be necessary to traverse through multiple data elements to construct the data being requested in connection with the read operation.

Following the step 548 is a test step 552 where it is determined if all of the requested data has been retrieved. If so, then processing is complete. Otherwise, control transfers from the test step 552 to a step 554 where the pointer used to iterate through elements of the list is made to point to the next element (i.e., using the forward pointer field 464). Following the step 554 is a test step 556 where it is determined if the pointer used to iterate through elements of the list has passed the end of the list (i.e., equals null). If not, then control transfers from the test step 556 back to the step 548 to fill in additional data, as discussed above. Otherwise, control transfers from the test step 556 to a step 558 to fill in the remaining (missing) data with data from the base track created on the first write to the standard logical device 404 after the CB virtual device 402 was created. Using data at the step 558 means that no corresponding data was written after the CB virtual device 402 was created. Following the step 558, processing is complete.
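The read of the flow chart 540 amounts to overlaying partial-track elements from newest to oldest and then filling any gaps from the base track. A minimal sketch follows, assuming elements are dictionaries with hypothetical `offset` and `data` keys held in a list ordered most recent first; the function name `read_present` is also an assumption.

```python
def read_present(elements, base, std_track, offset, length):
    if not elements:                                     # test step 542
        return bytes(std_track[offset:offset + length])  # step 544
    out = [None] * length                                # space that accepts the data
    for el in elements:                                  # steps 546, 554, 556
        for i, byte in enumerate(el["data"]):            # step 548: fill in data
            pos = el["offset"] + i - offset
            # Newer data was placed first, so never overwrite a filled position.
            if 0 <= pos < length and out[pos] is None:
                out[pos] = byte
        if None not in out:                              # test step 552
            break
    # Step 558: remaining gaps come from the base track.
    return bytes(base[offset + i] if b is None else b for i, b in enumerate(out))


log = [
    {"offset": 3, "data": b"zz"},     # most recent write
    {"offset": 2, "data": b"xxxx"},   # earlier write
]
track_now = read_present(log, b"ABCDEFGH", b"ABCDEFGH", 0, 8)
```

Here the later two-byte write takes precedence over the overlapping part of the earlier four-byte write, and the untouched bytes fall through to the base track.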

Referring to FIG. 21, a flow chart 570 illustrates steps performed in connection with reading data from a previous state (previous time) of the storage device to obtain data as it existed at a particular time (a target time). The system described herein may provide continuous or nearly continuous backup of data such that data from any point in time (since initiation of the system) may be read. The steps illustrated by the flow chart 570 correspond to reading (recovering) data written up to a particular point in time.

Processing begins at a first test step 572 where it is determined if the track from which the data is being read has an entry in the CB virtual device 402 that points to the standard logical device 404. As discussed elsewhere herein, if an entry for a particular track points to the standard logical device 404, then the particular track has not been written to since initiation of the system. If it is determined at the test step 572 that the requested data is not on the standard logical device 404 (i.e., the corresponding entry in the CB virtual device 402 does not point to the standard logical device 404), then control transfers from the test step 572 to a step 574 where a pointer used to iterate through elements of one of the linked lists 482-484 (i.e., an iterating pointer) is set to point at the first element of the list.

Following the step 574 is a test step 576 where it is determined if the data element being pointed to by the iterating pointer corresponds to data written after the target time. Note that either the timer field 454 or the cycle counter field 456 may be used to specify a particular time of interest and to determine if the current data is after the target time. Use of the timer and the cycle counter is discussed in more detail hereinafter.

If it is determined at the test step 576 that the data element being pointed to by the pointer used to iterate through the elements of the list corresponds to data written after the target time, then control transfers from the test step 576 to a step 578 where the iterating pointer is made to point to the next data element. Following the step 578 is a test step 582 which determines if the iterating pointer points to the end of the list. If not, then control transfers from the test step 582 back to the test step 576, discussed above.

If it is determined at the test step 582 that the iterating pointer points to the end of the list of elements (i.e., points to null), then control transfers from the test step 582 to a step 584 where the standard logical device 404 is used to provide the requested data. In such a situation, all of the write operations have occurred after the target time so that the desired data is stored on the standard logical device 404. Following the step 584, processing is complete. Note that the step 584 is also reached from the test step 572 if it is determined that the entry in the CB virtual device 402 points to the standard logical device 404, indicating that no writes have occurred to the particular track since initiation of the system.

If it is determined at the test step 576 that the iterating pointer points to data that is not after the target time, then control transfers from the test step 576 to a step 588 where the data from the element being pointed to by the iterating pointer is used to fill in a variable or data space used to accept the data being read. Note that, to read the data, it may be necessary to process the data elements in inverse chronological order, giving precedence to more recent data. However, since the elements of the linked lists 482-484 do not each necessarily contain an entire track of data, it may be necessary to traverse through multiple data elements to construct the data being requested in connection with the read operation.

Following the step 588 is a test step 592 where it is determined if all of the requested data has been retrieved. If so, then processing is complete. Otherwise, control transfers from the test step 592 to a step 594 where the iterating pointer is made to point to the next element (i.e., using the forward pointer field 464). Following the step 594 is a test step 596 where it is determined if the iterating pointer has passed the end of the list (i.e., equals null). If not, then control transfers from the test step 596 back to the step 588 to fill in any additional data, as discussed above. Otherwise, control transfers from the test step 596 to a step 598 to fill in the remaining (missing) data with data from the base track created on the first write to the standard logical device 404 after the CB virtual device 402 was created. Using data at the step 598 means that no corresponding data was written after the CB virtual device 402 was created. Following the step 598, processing is complete.
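The point-in-time read of the flow chart 570 differs from a present-state read only in that elements written after the target time are skipped first. The sketch below expresses the target time as a cycle counter value and again assumes a list of dictionaries ordered most recent first; the name `read_at` and the dictionary keys are hypothetical.

```python
def read_at(elements, base, std_track, offset, length, target_cycle):
    i = 0                                                # step 574
    # Test step 576 / steps 578, 582: skip elements written after the target time.
    while i < len(elements) and elements[i]["cycle"] > target_cycle:
        i += 1
    if i == len(elements):                               # nothing at or before target
        return bytes(std_track[offset:offset + length])  # step 584
    out = [None] * length
    for el in elements[i:]:                              # steps 588, 594, 596
        for j, byte in enumerate(el["data"]):
            pos = el["offset"] + j - offset
            if 0 <= pos < length and out[pos] is None:   # newer data has precedence
                out[pos] = byte
        if None not in out:                              # test step 592
            break
    # Step 598: remaining gaps come from the base track.
    return bytes(base[offset + k] if b is None else b for k, b in enumerate(out))


history = [
    {"cycle": 3, "offset": 3, "data": b"zz"},
    {"cycle": 1, "offset": 2, "data": b"xxxx"},
]
at_cycle_1 = read_at(history, b"ABCDEFGH", b"ABCDEFGH", 0, 8, target_cycle=1)
```

In this example the write made at cycle 3 is ignored when reading the track as of cycle 1.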

Referring to FIG. 22, a flow chart 610 illustrates steps performed in connection with reverting data to its state at a particular time of interest (target time) and possibly writing new data to the reverted data or even inserting new data as if it had been written at a previous point in time. Processing begins at a first step 612 where a pointer that is used to iterate through all of the tracks (track iterating pointer) is set to point to the first track of the CB virtual device 402. Following the step 612 is a test step 614 which determines if the corresponding entry for the CB virtual device 402 points to the standard logical device 404. If not, then control transfers from the test step 614 to a step 616 where a pointer (element iterating pointer) used to iterate through corresponding elements of the log device 406 is set to point to the last (most recent) element of the linked list corresponding to the particular track pointed to by the track iterating pointer.

Following the step 616 is a test step 618 which determines if the element iterating pointer points to an element having a time that is after the target time (desired restoration time). If so, then control transfers from the test step 618 to a step 622 where the element pointed to by the element iterating pointer is disposed (i.e., the memory used by the element is freed for use in some fashion consistent with the memory management scheme that is used). Following the step 622 is a step 624 where the appropriate pointers are adjusted. At the step 624, the element iterating pointer is made to point to the next most recent element. In addition, if data is disposed at the step 622, pointers used by the data structures for the CB virtual device 402, the log device 406, etc. may also be adjusted. However, as discussed in more detail below, for alternative embodiments, no data may be disposed. Following the step 624, control transfers back to the step 618 for the next iteration. Note that the target time may be expressed either in terms of a particular value for the cycle counter or a particular value for the timer.

An alternative embodiment is illustrated by a path 625 from the step 618 directly to the step 624 when the element iterating pointer points to an element having a time that is after the desired restoration time. In this embodiment, data that is after the restoration time is not discarded.

If it is determined at the test step 618 that the element pointed to by the element iterating pointer has a time associated therewith that is not after the target time, then control transfers from the test step 618 to a step 626 where data is accumulated corresponding to the data that will be written back to the standard logical device 404 to cause the standard logical device 404 to revert to the state thereof at the target time. Accumulating the data at the step 626 may include starting with new data to be written (if any) and then filling in any gaps (e.g., parts of a track that are not being written with new data) using, for example, processing like that illustrated in connection with the flow chart 570 of FIG. 21. Following the step 626 is a step 627 where the data accumulated at the step 626 is written to the standard logical device 404. The write at the step 627 could be a conventional write or could be a continuous backup write. Following the step 627 is the step 628, discussed above.
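The restore of the flow chart 610 may be sketched roughly as follows. This is a simplified model under stated assumptions: each track is a dictionary with hypothetical `base` and `elements` keys (elements ordered most recent first), `std` is a list of track images standing in for the standard logical device 404, the target time is a cycle counter value, and the `discard` flag selects between the step 622 and the alternative path 625. New data to be written during the restore is not modeled.

```python
def restore(tracks, std, target_cycle, discard=True):
    for t, trk in enumerate(tracks):                     # step 612: iterate over tracks
        if not trk["elements"]:                          # test step 614: entry points
            continue                                     # to the standard device
        # Steps 616-624: pass over elements written after the target time.
        kept = [el for el in trk["elements"] if el["cycle"] <= target_cycle]
        if discard:
            trk["elements"] = kept                       # step 622: dispose newer data
        # Step 626: accumulate the track image as of the target time. Applying
        # the kept elements oldest first lets newer data overwrite older data.
        image = bytearray(trk["base"])
        for el in reversed(kept):
            image[el["offset"]:el["offset"] + len(el["data"])] = el["data"]
        std[t] = bytes(image)                            # step 627: write back


tracks = [
    {"base": b"ABCDEFGH",
     "elements": [{"cycle": 3, "offset": 3, "data": b"zz"},
                  {"cycle": 1, "offset": 2, "data": b"xxxx"}]},
    {"base": None, "elements": []},                      # never written
]
std = [b"ABxzzxGH", b"12345678"]
restore(tracks, std, target_cycle=1)
```

With `discard=False`, the elements written after the target time remain in the list, which corresponds to the embodiment in which later data is not discarded.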

Note that once the CB virtual device 402 has been restored to a particular state, it is possible to continue operation, including providing new data writes to the system. In some instances, it may be desirable to insert new data (write new data) at a particular target time or delete data from a particular target time. Note that these two operations may be used together in a way that allows a user to insert data, test the result thereof, and then subsequently delete the inserted data. For example, a user may discover a data inconsistency in a database at 5:00 p.m. and may attempt to address the inconsistency by simulating a writing of additional data (or different data) at 3:00 p.m. However, if that does not fix the problem, the user may desire to undo the simulated write and try something else. Of course, the ability to insert and delete data at different points in time may have any number of uses.

Referring to FIG. 23, a flow chart 650 illustrates steps performed in connection with inserting new data at a particular target time (i.e., writing the data as if it occurred at a particular target time that may be prior to the current time and prior to subsequent data write operations) or deleting data from a particular target time. Processing for the flow chart 650 begins at a first test step 652 where it is determined if the entry for the track for the CB virtual device 402 points to the standard logical device 404. If so, then control transfers from the test step 652 to a test step 653 where it is determined if a data insert is being performed. If so, then control transfers from the step 653 to a step 654 where a normal write operation is performed. Note that, in the case of possibly removing data, there is no data to delete if the CB virtual device 402 points to the standard logical device 404. Following the step 654, processing is complete.

If it is determined at the test step 652 that the track for the CB virtual device 402 does not point to the standard logical device 404, then control transfers from the test step 652 to a step 656 where an element iteration pointer is made to point to the first element in the linked list of elements corresponding to the track. Following the step 656 is a test step 658 where it is determined if the element iteration pointer points past the end of the list. If not, then control transfers from the test step 658 to a test step 662 where it is determined if the element iteration pointer points to an element having a time associated therewith that is after the target time. If so, then control transfers from the test step 662 to a step 664 where the element iteration pointer is made to point to the next element. Following the step 664, control transfers back to the test step 658, discussed above.

If it is determined at the test step 658 that the element iteration pointer points past the end of the linked list, or if it is determined at the test step 662 that the element iteration pointer points to an element having a time associated therewith that is not after the target time, then control transfers to a step 666 where a new element, corresponding to the data to be inserted, is added or where the element pointed to is deleted. In other embodiments, elements to be deleted may be specially marked or tagged (e.g., at the time of insertion) or there may be any one of a number of techniques used to identify elements to be deleted.

Following the step 666 is a step 668 where pointers of the linked list are adjusted to accommodate the addition of the new element or deletion of one or more elements. Following the step 668, processing is complete.

Note that, in the case of removing data elements or adding multiple data elements, it may be possible to iteratively execute some or all of the steps of the flow chart 650 of FIG. 23. For example, to remove all instances of data having a particular characteristic or mark, the steps of the flow chart 650 may be executed for each track of interest.
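The insert and delete operations of the flow chart 650 may be sketched for the case where the track already has a list in the log device (the steps 652-654 for an unwritten track are not modeled). The function names, the list-of-dictionaries representation ordered most recent first, and the use of a cycle counter value as the target time are assumptions. The delete follows a literal reading of the step 666 and removes whichever element the iteration pointer stops on.

```python
def find_position(elements, target_cycle):
    """Steps 656-664: index of the first element that is not after the target."""
    i = 0
    while i < len(elements) and elements[i]["cycle"] > target_cycle:
        i += 1
    return i


def insert_at(elements, new_el):
    # Step 666 (insert): place the element as if written at new_el["cycle"].
    # list.insert takes the place of the pointer adjustment at the step 668.
    elements.insert(find_position(elements, new_el["cycle"]), new_el)


def delete_at(elements, target_cycle):
    # Step 666 (delete): remove the element pointed to, if there is one.
    i = find_position(elements, target_cycle)
    if i == len(elements):
        return False                     # pointer ran past the end of the list
    del elements[i]                      # step 668: the list is relinked
    return True


els = [{"cycle": 5, "data": b"late"}, {"cycle": 1, "data": b"early"}]
insert_at(els, {"cycle": 3, "data": b"simulated"})   # try a simulated write
order_after_insert = [el["cycle"] for el in els]
undone = delete_at(els, 3)                           # undo the simulated write
```

This mirrors the usage described above, in which a user inserts data at an earlier time, tests the result, and then deletes the inserted data.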

Referring to FIG. 24, a flow chart 680 illustrates steps performed in connection with a read or write operation from or to the standard logical device 404 while the standard logical device 404 is being returned to a state corresponding to an earlier point in time. In some embodiments, it may be possible to suspend all I/O operations for the standard logical device 404 while a restore is being performed like the restore operation illustrated by the flow chart 610 of FIG. 22, described above. However, in other instances, suspending I/O operations may be unacceptable, in which case it may become necessary to allow I/O operations while the restore is being performed.

Processing for the flow chart 680 begins at a test step 682 where it is determined whether the track being accessed needs to be restored. In some cases, the track being accessed may have already been restored or may not need to be restored because the track was never modified. In any case, if it is determined at the test step 682 that the track being accessed needs to be restored (i.e., the track was modified and has not yet been restored), then control passes from the step 682 to a step 684 where the restore operation is performed for the track. The restore operation performed at the step 684 is like the restore operation described above in connection with the flow chart 610 of FIG. 22. Following the step 684, or following the step 682 if no restore is needed, is a step 686 where a normal read or write operation is performed to the (now restored) track of the standard logical device. Following the step 686, processing is complete.
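The restore-on-access behavior of the flow chart 680 may be illustrated with a brief sketch. The function name `access_during_restore`, the `pending` set of tracks still awaiting restoration, and the two callables are assumptions introduced for illustration.

```python
def access_during_restore(track, pending, restore_track, op):
    """Perform one read or write while a restore is in progress.

    pending       -- set of tracks that were modified and are not yet restored
    restore_track -- callable that restores a single track (as in FIG. 22)
    op            -- callable that performs the normal read or write
    """
    if track in pending:             # test step 682: does the track need restoring?
        restore_track(track)         # step 684: restore just this track first
        pending.discard(track)
    return op(track)                 # step 686: normal read or write


events = []
pending = {0, 2}
result = access_during_restore(
    0,
    pending,
    restore_track=lambda t: events.append(("restore", t)),
    op=lambda t: events.append(("io", t)) or "done",
)
```

Only the track being accessed is restored ahead of the operation, so host I/O need not wait for the whole device to be restored.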

Referring to FIG. 25, a diagram 700 illustrates a plurality of storagedevices 702-704 coupled to the external process 416 illustrated in thediagram 400 of FIG. 16. The storage devices 702-704 may be part of aconsistency group. The external process 416 may be used to synchronizethe storage devices 702-704 by synchronizing the timer 414 and/or thecycle counter 412. In an embodiment herein, synchronization may beperformed by temporarily suspending write operations prior to updatingthe timer 414 and/or the cycle counter 412. Alternatively, the externalprocess 416 may wait for write operations for the storage devices702-704 to become quiescent prior to updating the timer 414 and/or thecycle counter 412.

Note that initialization of the system described herein may be performed by simply creating the CB virtual device 402, setting the cycle counter 412 and the timer 414 to appropriate initial values, and beginning operation. In embodiments where the external process 416 is used, the external process 416 may also be initialized and may also be used to simultaneously begin continuous backup operations for multiple storage devices. Otherwise, in embodiments where the external process 416 (or the equivalent) is not used, updating the cycle counter 412 and/or the timer 414 may be performed by any appropriate means, including by the process 408 that handles input and output operations.

In some cases, it may be desirable to provide continuous backup to astorage device that is different from the storage device written to bythe host. The host may be coupled to a first (local) storage device andthe first storage device may be coupled to a second (remote) storagedevice that maintains the continuous backup of data as described herein.In some embodiments, a continuous backup may be maintained on the localstorage device and the remote storage device while in other embodimentsthe continuous backup may be maintained on the remote storage deviceonly.

Referring to FIG. 26, a diagram 820 shows a relationship between a host822, a local storage device 824 and a remote storage device 826. Thestorage devices 824, 826 may be like the storage device 24, discussedabove. The host 822 reads and writes data from and to the local storagedevice 824. Although the diagram 820 only shows one host 822, it will beappreciated by one of ordinary skill in the art that multiple hosts arepossible. Data from the local storage device 824 may be transferred tothe remote storage device 826 via a link therebetween. Although only theone link is shown, it is possible to have additional links between thestorage devices 824, 826 and to have links between one or both of thestorage devices 824, 826 and other storage devices (not shown).

In an embodiment herein, data written from the host 822 to the local storage device 824 is continuously backed up at the remote storage device 826 using processing at the remote storage device 826 like that described herein in connection with FIGS. 19-24. In some embodiments, the data may also be continuously backed up at the local storage device 824 while in other embodiments the data is only continuously backed up at the remote storage device 826. As described in more detail elsewhere herein, data written by the host 822 is associated with a particular cycle number by the local storage device 824. The cycle number assigned by the local storage device 824 corresponds to the cycle counter field 456 described above in connection with FIG. 17. The data and the cycle number associated therewith are then transmitted to the remote storage device 826, which has a standard logical device, CB virtual device, and log device for performing the continuous backup. Associating data written by the host 822 with cycle numbers and transferring the data from the local storage device 824 to the remote storage device 826 is discussed in more detail elsewhere herein.

Referring to FIG. 27, a diagram 830 illustrates a continuous backup (CB)virtual device 832 that is like the CB virtual device 402 discussedabove elsewhere herein. The CB virtual device 832 contains pointers to astandard logical device 834 for a plurality of tracks such that, for anyparticular track, if the CB virtual device 832 points to a correspondingtrack of the standard logical device 834, then the corresponding trackhas not changed since creation of the CB virtual device 832. The CBvirtual device 832 also contains pointers to a log device 836 for aplurality of corresponding tracks. The log device 836 contains data fortracks that have changed since creation of the CB virtual device 832 andis like the log device 406 discussed above. The CB virtual device 832,the standard logical device 834, and the log device 836 may all beprovided on the remote storage device 826.

The diagram 830 also shows an I/O module 838 that handles receipt andstorage of data received by the remote storage device 826 for continuousbackup storage. Operation of the I/O module 838 is like operation of theI/O module 408 discussed elsewhere herein. The I/O module 838 isprovided with data stored in temporary storage 842 of the remote storagedevice 826. The temporary storage 842 may be implemented using, forexample, volatile memory of the remote storage device 826 and/ordedicated disk storage space of the remote storage device 826. Datawithin the temporary storage 842 is provided to the remote storagedevice 826 from the local storage device 824 as described in more detailelsewhere herein.

The following discussion relates to providing continuous backup on theremote storage device 826 of data on the local storage device 824.Multiple embodiments are disclosed for different ways that data havingan appropriate sequence number (cycle number) may be provided to theremote storage device 826 from the local storage device 824.

Referring to FIG. 28, a path of data is illustrated from the host 822 tothe local storage device 824 and the remote storage device 826. Datawritten from the host 822 to the local storage device 824 may be storedlocally, as illustrated by the data element 851 of the local storagedevice 824. Storing the data locally may include writing the datadirectly to a logical storage device of the local storage device 824and/or providing the continuous backup functionality described herein atthe local storage device 824. The data that is written by the host 822to the local storage device 824 may also be maintained by the localstorage device 824 in connection with being sent by the local storagedevice 824 to the remote storage device 826.

In the system described herein, each data write by the host 822 (of, for example, a record, a plurality of records, a track, etc.) is assigned a sequence number (cycle number). The sequence number may be provided in an appropriate data field associated with the write. In FIG. 28, the writes by the host 822 are shown as being assigned sequence number N. All of the writes performed by the host 822 that are assigned sequence number N are collected in a single chunk of data 852. The chunk 852 represents a plurality of separate writes by the host 822 that occur at approximately the same time.

Generally, the local storage device 824 accumulates chunks of onesequence number while transmitting a previously accumulated chunk(having the previous sequence number) to the remote storage device 826.Thus, while the local storage device 824 is accumulating writes from thehost 822 that are assigned sequence number N, the writes that occurredfor the previous sequence number (N−1) are transmitted by the localstorage device 824 to the remote storage device 826. A chunk 854represents writes from the host 822 that were assigned the sequencenumber N−1 that have not been transmitted yet to the remote storagedevice 826.

The remote storage device 826 receives the data from the chunk 854 corresponding to writes assigned a sequence number N−1 and constructs a new chunk 856 of host writes having sequence number N−1. When the remote storage device 826 has received all of the data from the chunk 854, the local storage device 824 sends a commit message to the remote storage device 826 to commit all the data assigned the N−1 sequence number corresponding to the chunk 856. Generally, once a chunk corresponding to a particular sequence number is committed, that chunk may be written to the logical storage device of the remote storage device 826 and/or be used for providing continuous backup at the remote storage device 826. This is illustrated in FIG. 28 with a chunk 858 corresponding to writes assigned sequence number N−2 (i.e., two before the current sequence number being used in connection with writes by the host 822 to the local storage device 824).

In FIG. 28, the chunk 858 is shown as being written to a data element862 representing disk storage and/or continuous backup at the remotestorage device 826. Thus, the remote storage device 826 is receiving andaccumulating the chunk 856 corresponding to sequence number N−1 whilethe chunk 858 corresponding to the previous sequence number (N−2) isbeing written to disk storage of the remote storage device 826 and/orbeing used for remote continuous backup as illustrated by the dataelement 862.

Thus, in operation, the host 822 writes data to the local storage device824 that is stored locally in the data element 851, possiblycontinuously backed up at the local storage device 824, and accumulatedin the chunk 852. Once all of the data for a particular sequence numberhas been accumulated (described elsewhere herein), the local storagedevice 824 increments the sequence number. Data from the chunk 854corresponding to one less than the current sequence number istransferred from the local storage device 824 to the remote storagedevice 826. The chunk 858 corresponds to data for a sequence number thatwas committed by the local storage device 824 sending a message to theremote storage device 826. Data from the chunk 858 is written to diskstorage of the remote storage device 826 and/or continuously backed upat the remote storage device 826.
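
The roles of the chunks 852, 854, 856, 858 across one increment of the sequence number may be pictured with the following simplified sketch. It is an illustration under stated assumptions rather than the actual mechanism: the function advance and the dictionary keys are hypothetical, a single entry stands for both the chunk 854 and its remote counterpart 856, and transmission, commit, and writing to the data element 862 are each treated as completing instantaneously at the switch.

```python
def advance(pipeline):
    """Model one sequence number switch for the chunks of FIG. 28.

    pipeline is a dict with hypothetical keys:
      seq           current sequence number N
      accumulating  host writes collected for N (chunk 852)
      transmitting  writes for N-1 in transit (chunks 854/856)
      committed     writes for N-2 already committed (chunk 858)
      disk          writes already stored at the remote device (data element 862)
    """
    return {
        "seq": pipeline["seq"] + 1,
        "accumulating": [],                         # new, empty chunk for N+1
        "transmitting": pipeline["accumulating"],   # old N becomes the new N-1
        "committed": pipeline["transmitting"],      # old N-1 becomes the new N-2
        "disk": pipeline["disk"] + pipeline["committed"],
    }
```

Each call shifts every chunk one stage along the path from the host 822 toward storage at the remote storage device 826, so a given write reaches the remote disk storage two switches after the cycle in which it was accumulated in this model.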

Referring to FIG. 29, a diagram 870 illustrates items used to constructand maintain the chunks 852, 854. A standard logical device 872 providedon the local storage device 824 contains data written by the host 822and corresponds to the data element 851 of FIG. 28. The standard logicaldevice 872 contains data written by the host 822 to the local storagedevice 824.

Two linked lists of pointers 874, 876 are used in connection with the standard logical device 872. The linked list 874 contains a plurality of pointers 881-885, each of which points to a portion of data used in connection with the local storage device 824. The data may be provided in a cache memory 888 of the local storage device 824. Similarly, the linked list 876 contains a plurality of pointers 891-895, each of which points to a portion of data provided in the cache memory 888. The cache memory 888 contains a plurality of cache slots 902-904 that may be used in connection with writes to the standard logical device 872 and, at the same time, used in connection with the linked lists 874, 876.

Each of the linked lists 874, 876 may be used for one of the chunks ofdata 852, 854 so that, for example, the linked list 874 may correspondto the chunk of data 852 for sequence number N while the linked list 876may correspond to the chunk of data 854 for sequence number N−1. Thus,when data is written by the host 822 to the local storage device 824,the data is provided to the cache 888 and, in some cases (describedelsewhere herein), an appropriate pointer of the linked list 874 iscreated. Note that the data will not be removed from the cache 888 untilthe data is destaged to the standard logical device 872 and the data isalso no longer pointed to by one of the pointers 881-885 of the linkedlist 874, as described elsewhere herein.

In an embodiment herein, one of the linked lists 874, 876 is deemed“active” while the other is deemed “inactive”. Thus, for example, whenthe sequence number N is even, the linked list 874 may be active whilethe linked list 876 is inactive. The active one of the linked lists 874,876 handles writes from the host 822 while the inactive one of thelinked lists 874, 876 corresponds to the data that is being transmittedfrom the local storage device 824 to the remote storage device 826.While the data that is written by the host 822 is accumulated using theactive one of the linked lists 874, 876 (for the sequence number N), thedata corresponding to the inactive one of the linked lists 874, 876 (forprevious sequence number N−1) is transmitted from the local storagedevice 824 to the remote storage device 826.

Once data corresponding to a particular one of the pointers in one ofthe linked lists 874, 876 has been transmitted to the remote storagedevice 826, the particular one of the pointers may be removed from theappropriate one of the linked lists 874, 876. In addition, the data mayalso be marked for removal from the cache 888 (i.e., the slot may bereturned to a pool of slots for later, unrelated, use) provided that thedata in the slot is not otherwise needed for another purpose (e.g., tobe destaged to the standard logical device 872). A mechanism may be usedto ensure that data is not removed from the cache 888 until all devicesare no longer using the data.

Referring to FIG. 30, a slot 920, like one of the slots 902-904 of thecache 888, includes a header 922 and data 924. The header 922corresponds to overhead information used by the system to manage theslot 920. The data 924 is the corresponding data that is being(temporarily) stored in the slot 920. Information in the header 922includes pointers back to disk storage of the local storage device 824,time stamp(s), etc.

The header 922 also includes a cache stamp 926 used in connection withthe system described herein. In an embodiment herein, the cache stamp926 is eight bytes. Two of the bytes are a “password” that indicateswhether the slot 920 is being used by the system described herein. Inother embodiments, the password may be one byte while the following byteis used for a pad. As described elsewhere herein, the two bytes of thepassword (or one byte, as the case may be) being equal to a particularvalue indicates that the slot 920 is pointed to by at least one entry ofthe linked lists 874, 876. The password not being equal to theparticular value indicates that the slot 920 is not pointed to by anentry of the linked lists 874, 876. Use of the password is describedelsewhere herein.

The cache stamp 926 also includes a two byte field indicating thesequence number (e.g., N,N−1, N−2, etc.) of the data 924 of the slot920. As described elsewhere herein, the sequence number field of thecache stamp 926 may be used to facilitate the processing describedherein. The remaining four bytes of the cache stamp 926 may be used fora pointer, as described elsewhere herein. Of course, the two bytes ofthe sequence number and the four bytes of the pointer are only validwhen the password equals the particular value that indicates that theslot 920 is pointed to by at least one entry in one of the lists 874,876.
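
The eight byte layout of the cache stamp 926 (two bytes of password, two bytes of sequence number, four bytes of pointer) may be illustrated with the following sketch. The byte order, the field order within the stamp, the particular password value, and the names STAMP, PASSWORD, NULL_PTR, pack_stamp, and unpack_stamp are all assumptions made for illustration; the text above does not specify them.

```python
import struct

# Big-endian, no padding: 2-byte password, 2-byte sequence number, 4-byte pointer.
STAMP = struct.Struct(">HHI")

PASSWORD = 0xCB5A   # hypothetical "particular value" marking a slot as in use
NULL_PTR = 0        # hypothetical encoding of a Null pointer field


def pack_stamp(seq, ptr=NULL_PTR):
    """Build the 8-byte stamp for a slot pointed to by one of the lists."""
    # The sequence number field is only two bytes wide, so it wraps at 65536.
    return STAMP.pack(PASSWORD, seq & 0xFFFF, ptr)


def unpack_stamp(raw):
    """Return (sequence number, pointer), or None if the password does not match.

    The sequence number and pointer are only meaningful when the password
    equals the particular value, so they are not returned otherwise.
    """
    password, seq, ptr = STAMP.unpack(raw)
    if password != PASSWORD:
        return None
    return seq, ptr
```

Because the sequence number field is two bytes, any comparison of sequence numbers in such a layout would have to allow for wraparound; the text above does not describe how that is handled.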

Referring to FIG. 31, a flow chart 940 illustrates steps performed bythe local storage device 824 in connection with the host 822 performinga write operation. Of course, when the host 822 performs a write,processing occurs for handling the write in a normal fashionirrespective of whether the data is being continuously backed up at theremote storage device 826.

Processing begins at a first step 942 where a slot corresponding to thewrite is locked. In an embodiment herein, each of the slots 902-904 ofthe cache 888 corresponds to a track of data on the standard logicaldevice 872. Locking the slot at the step 942 prevents additionalprocesses from operating on the relevant slot during the processingperformed by the local storage device 824 corresponding to the steps ofthe flow chart 940.

Following the step 942 is a step 944 where a value for N, the sequence number, is set. As discussed elsewhere herein, the value for the sequence number obtained at the step 944 is maintained during the entire write operation performed by the local storage device 824 while the slot is locked. As discussed elsewhere herein, the sequence number is assigned to each write to determine the one of the chunks of data 852, 854 to which the write belongs. Writes performed by the host 822 are assigned the current sequence number. It is useful that a single write operation maintain the same sequence number throughout.

Following the step 944 is a test step 946 which determines if thepassword field of the cache slot is valid. As discussed above, thesystem described herein sets the password field to a predetermined valueto indicate that the cache slot is already in one of the linked lists ofpointers 874, 876. If it is determined at the test step 946 that thepassword field is not valid (indicating that the slot is new and that nopointers from the lists 874, 876 point to the slot), then control passesfrom the step 946 to a step 948, where the cache stamp of the new slotis set by setting the password to the predetermined value, setting thesequence number field to N, and setting the pointer field to Null. Inother embodiments, the pointer field may be set to point to the slotitself.

Following the step 948 is a step 952 where a pointer to the new slot is added to the active one of the pointer lists 874, 876. In an embodiment herein, the lists 874, 876 are circular doubly linked lists, and the new pointer is added to the circular doubly linked list in a conventional fashion. Of course, other appropriate data structures could be used to manage the lists 874, 876. Following the step 952 is a step 954 where flags are set. At the step 954, a write pending flag may be set to indicate that the slot needs to be transmitted to the remote storage device 826. In addition, at the step 954, an in cache flag may be set to indicate that the slot needs to be destaged to the standard logical device 872. Following the step 954 is a step 956 where the data being written by the host 822 is written to the slot. Following the step 956 is a step 958 where the slot is unlocked. Following the step 958, processing is complete.

If it is determined at the test step 946 that the password field of theslot is valid (indicating that the slot is already pointed to by atleast one pointer of the lists 874, 876), then control transfers fromthe step 946 to a test step 962, where it is determined whether thesequence number field of the slot is equal to the current sequencenumber, N. Note that there are two valid possibilities for the sequencenumber field of a slot with a valid password. It is possible for thesequence number field to be equal to N, the current sequence number.This occurs when the slot corresponds to a previous write with sequencenumber N. The other possibility is for the sequence number field toequal N−1. This occurs when the slot corresponds to a previous writewith sequence number N−1. Any other value for the sequence number fieldis invalid. Thus, for some embodiments, it may be possible to includeerror/validity checking in the step 962 or possibly make error/validitychecking a separate step. Such an error may be handled in anyappropriate fashion, which may include providing a message to a user.

If it is determined at the step 962 that the value in the sequencenumber field of the slot equals the current sequence number N, then nospecial processing is required and control transfers from the step 962to the step 956, discussed above, where the data is written to the slot.Otherwise, if the value of the sequence number field is N−1 (the onlyother valid value), then control transfers from the step 962 to a step964 where a new slot is obtained. The new slot obtained at the step 964may be used to store the data being written.

Following the step 964 is a step 966 where the data from the old slot is copied to the new slot that was obtained at the step 964. Note that the copied data includes the write pending flag, which should have been set at the step 954 on a previous write when the slot was first created. Following the step 966 is a step 968 where the cache stamp for the new slot is set by setting the password field to the appropriate value, setting the sequence number field to the current sequence number, N, and setting the pointer field to point to the old slot. Following the step 968 is a step 972 where a pointer to the new slot is added to the active one of the linked lists 874, 876. Following the step 972 is the step 956, discussed above, where the data is written to the slot which, in this case, is the new slot.
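
The write handling of the flow chart 940 may be sketched as follows. This is a single-threaded illustration under several assumptions, not the implementation of the system described herein: slot locking (the steps 942, 958) is omitted, plain Python lists stand in for the circular doubly linked lists 874, 876, a boolean stands in for the password field, and the cache is assumed to direct later writes for a track to the newest slot. The names Slot, LocalDevice, active, inactive, and write are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slot:
    track: int
    data: bytes = b""
    password_valid: bool = False       # stand-in for the password field
    seq: int = 0                       # sequence number field of the cache stamp
    old_slot: Optional["Slot"] = None  # pointer field of the cache stamp
    write_pending: bool = False
    in_cache: bool = False


class LocalDevice:
    def __init__(self):
        self.seq = 0
        self.lists = ([], [])   # stand-ins for the linked lists 874, 876
        self.cache = {}         # track number -> current Slot (assumption)

    def active(self):
        return self.lists[self.seq % 2]

    def inactive(self):
        return self.lists[(self.seq + 1) % 2]

    def write(self, track, data):
        n = self.seq                                  # step 944: fix N for this write
        slot = self.cache.setdefault(track, Slot(track))
        if not slot.password_valid:                   # step 946: slot is new
            slot.password_valid = True                # step 948: set the cache stamp
            slot.seq, slot.old_slot = n, None
            self.active().append(slot)                # step 952
            slot.write_pending = slot.in_cache = True  # step 954
        elif slot.seq != n:                           # step 962: must then be N-1
            if slot.seq != n - 1:
                raise ValueError("invalid sequence number in slot")
            # Steps 964-968: obtain a new slot, copy the old data and flags,
            # and stamp it with N and a pointer back to the old slot.
            new = Slot(track, slot.data, True, n, slot,
                       slot.write_pending, slot.in_cache)
            self.active().append(new)                 # step 972
            self.cache[track] = new
            slot = new
        slot.data = data                              # step 956
        return slot
```

Under these assumptions, two writes to the same track within one cycle share a slot, while a write in cycle N to a track last written in cycle N−1 leaves the old slot untouched on the inactive list so that it can still be transmitted.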

Referring to FIG. 32, a flow chart 1000 illustrates steps performed in connection with the local storage device 824 scanning the inactive one of the lists 874, 876 to transmit data from the local storage device 824 to the remote storage device 826 when the data has been accumulated according to the embodiment illustrated in connection with FIG. 29. As discussed above, the inactive one of the lists 874, 876 points to slots corresponding to the N−1 cycle for the local storage device 824 when the N cycle is being written to the local storage device 824 by the host 822 using the active one of the lists 874, 876.

Processing begins at a first step 1002 where it is determined if there are any entries in the inactive one of the lists 874, 876. As data is transmitted, the corresponding entries are removed from the inactive one of the lists 874, 876. In addition, new writes are provided to the active one of the lists 874, 876 and not generally to the inactive one of the lists 874, 876. Thus, it is possible (and desirable, as described elsewhere herein) for the inactive one of the lists 874, 876 to contain no data at certain times. If it is determined at the step 1002 that there is no data to be transmitted, then the inactive one of the lists 874, 876 is continuously polled until data becomes available. Data for sending becomes available in connection with a cycle switch (discussed elsewhere herein) where the inactive one of the lists 874, 876 becomes the active one of the lists 874, 876, and vice versa.

If it is determined at the step 1002 that there is data available forsending, control transfers from the step 1002 to a step 1004, where theslot is verified as being correct. The processing performed at the step1004 is an optional “sanity check” that may include, for example,verifying that the password field is correct and verifying that thesequence number field is correct. If there is incorrect (unexpected)data in the slot, error processing may be performed, which may includenotifying a user of the error and possibly error recovery processing.

Following the step 1004 is a step 1012, where the data is sent from the local storage device 824 to the remote storage device 826 in an appropriate manner. In an embodiment herein, the entire slot is not transmitted. Rather, only records within the slot that have the appropriate mirror bits set (indicating the records have changed) are transmitted to the remote storage device 826. However, in other embodiments, it may be possible to transmit the entire slot, provided that the remote storage device 826 only writes data corresponding to records having appropriate mirror bits set and ignores other data for the track, which may or may not be valid. Following the step 1012 is a test step 1014 where it is determined if the data that was transmitted has been acknowledged by the remote storage device 826. If not, the data is resent, as indicated by the flow from the step 1014 back to the step 1012. In other embodiments, different and more involved processing may be used to send data and acknowledge receipt thereof. Such processing may include error reporting and alternative processing that is performed after a certain number of attempts to send the data have failed.

Once it is determined at the test step 1014 that the data has been successfully sent, control passes from the step 1014 to a step 1016 to clear the write pending flag (since the data has been successfully sent). Following the step 1016 is a test step 1018 where it is determined if the slot is a duplicate slot created in connection with a write to a slot already having an existing entry in the inactive one of the lists 874, 876. This possibility is discussed above in connection with the steps 962, 964, 966, 968, 972. If it is determined at the step 1018 that the slot is a duplicate slot, then control passes from the step 1018 to a step 1022 where the slot is returned to the pool of available slots (to be reused). In addition, the slot may also be aged (or have some other appropriate mechanism applied thereto) to provide for immediate reuse ahead of other slots since the data provided in the slot is not valid for any other purpose. Following the step 1022, or the step 1018 if the slot is not a duplicate slot, is a step 1024 where the password field of the slot header is cleared so that when the slot is reused, the test at the step 946 of FIG. 31 properly classifies the slot as a new (unused) slot.

Following the step 1024 is a step 1026 where the entry in the inactive one of the lists 874, 876 is removed. Following the step 1026, control transfers back to the step 1002, discussed above, where it is determined if there are additional entries on the inactive one of the lists 874, 876 corresponding to data needing to be transferred.
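
The scan of the inactive list in the flow chart 1000 may be sketched as follows. This is an illustration under stated assumptions, not the implementation of the system described herein: slots are plain dictionaries, the key "duplicate" is a hypothetical flag standing in for the determination made at the step 1018, the function returns when the list is empty instead of polling, and a bounded number of send attempts is used (one of the alternative embodiments mentioned above) rather than resending indefinitely. The callable send is assumed to return True when the remote storage device 826 acknowledges the data.

```python
from collections import deque


def drain_inactive(inactive, send, max_attempts=3):
    """Transmit and remove every entry of the inactive list (a deque of dicts).

    Returns the list of slots handed back to the pool of available slots.
    """
    freed = []
    while inactive:                                   # step 1002
        slot = inactive[0]
        if not slot["password_valid"]:                # step 1004: optional sanity check
            raise ValueError("unexpected slot on the inactive list")
        for _ in range(max_attempts):                 # steps 1012 and 1014
            if send(slot["track"], slot["data"]):
                break
        else:
            raise IOError("remote storage device did not acknowledge the data")
        slot["write_pending"] = False                 # step 1016
        if slot["duplicate"]:                         # step 1018
            freed.append(slot)                        # step 1022
        slot["password_valid"] = False                # step 1024
        inactive.popleft()                            # step 1026
    return freed
```

In a fuller model, the loop would instead wait at the step 1002 until a cycle switch makes new entries available.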

Referring to FIG. 33, a flow chart 1050 illustrates steps performed inconnection with the local storage device 824 increasing the sequencenumber. Processing begins at a first step 1052 where the local storagedevice 824 waits at least M seconds prior to increasing the sequencenumber. In an embodiment herein, M is thirty, but of course M could beany number. Larger values for M increase the amount of data that may belost if communication between the storage devices 824, 826 is disrupted.However, smaller values for M increase the total amount of overheadcaused by incrementing the sequence number more frequently.

Following the step 1052 is a test step 1054 which determines if allwrite operations to the local storage device 824 associated with theprevious sequence number have completed. In some instances, a single I/Omay take a relatively long time and may still be in progress even afterthe sequence number has changed. Any appropriate mechanism may be usedat the step 1054.

If it is determined at the test step 1054 that I/O's from the previous sequence number have been completed, then control transfers from the step 1054 to a test step 1056 which determines if the inactive one of the lists 874, 876 is empty. Note that a sequence number switch may not be made unless and until all of the data corresponding to the inactive one of the lists 874, 876 has been completely transmitted from the local storage device 824 to the remote storage device 826. Once the inactive one of the lists 874, 876 is determined to be empty, then control transfers from the step 1056 to a step 1058 where the commit for the previous sequence number is sent from the local storage device 824 to the remote storage device 826. Receipt of a commit message for a particular sequence number indicates to the remote storage device 826 that the data corresponding to the sequence number has all been sent.

Following the step 1058 is a step 1062 where copying of data for theinactive one of the lists 874, 876 is suspended. As discussed elsewhereherein, the inactive one of the lists is scanned to send correspondingdata from the local storage device 824 to the remote storage device 826.It is useful to suspend copying data until the sequence number switch iscompleted.

Following step 1062 is a step 1064 where the sequence number isincremented. Following step 1064 is a test step 1072 which determines ifthe remote storage device 826 has acknowledged the commit message sentat the step 1058. Once it is determined that the remote storage device826 has acknowledged the commit message sent at the step 1058, controltransfers from the step 1072 to a step 1074 where the suspension ofcopying, which was provided at the step 1062, is cleared so that copyingmay resume. Following step 1074, processing is complete. Note that it ispossible to go from the step 1074 back to the step 1052 to begin a newcycle to continuously increment the sequence number.
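
The ordering of the steps of the flow chart 1050 may be sketched as follows. This is an illustration only: the function cycle_switch, the dictionary keys, and the convention that send_commit returns a callable reporting whether the commit has been acknowledged are all hypothetical, and simple polling stands in for whatever mechanism is used at the test steps 1054, 1056, 1072.

```python
import time


def cycle_switch(state, send_commit, min_wait=30.0, poll=0.01, sleep=time.sleep):
    """Perform one sequence number switch and return the new sequence number.

    state is a dict with hypothetical keys:
      seq                    current sequence number N
      writes_in_flight_prev  count of unfinished writes for the previous number
      inactive               entries still awaiting transmission
      copy_suspended         True while copying of inactive data is suspended
    """
    sleep(min_wait)                                   # step 1052: wait at least M seconds
    while state["writes_in_flight_prev"] > 0:         # step 1054
        sleep(poll)
    while state["inactive"]:                          # step 1056: wait until empty
        sleep(poll)
    acked = send_commit(state["seq"] - 1)             # step 1058: commit previous number
    state["copy_suspended"] = True                    # step 1062
    state["seq"] += 1                                 # step 1064
    while not acked():                                # step 1072: wait for acknowledgement
        sleep(poll)
    state["copy_suspended"] = False                   # step 1074
    return state["seq"]
```

Calling the function in a loop corresponds to returning from the step 1074 to the step 1052 to continuously increment the sequence number. The sleep parameter is included only so that the wait can be replaced when experimenting with the sketch.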

It is also possible to use tables at the local storage device 824 to collect slots associated with active and inactive chunks of data. In that case, one table could be associated with the inactive sequence number and another table could be associated with the active sequence number. This is described below.

Referring to FIG. 34, a diagram 1200 illustrates items used to constructand maintain the chunks 852, 854. A standard logical device 1202contains data written by the host 822 and corresponds to the dataelement 851, discussed above. The standard logical device 1202 containsdata written by the host 822 to the local storage device 824.

Two tables 1204, 1206 are used in connection with the standard logical device 1202. The tables 1204, 1206 may correspond to device tables that may be stored, for example, in the memory of the local storage device 824. Each track entry of the tables 1204, 1206 points either to a track of the standard logical device 1202 or to a slot of a cache 1208 used in connection with the local storage device 824.

The cache 1208 contains a plurality of cache slots 1212-1214 that may be used in connection with writes to the standard logical device 1202 and, at the same time, used in connection with the tables 1204, 1206. In an embodiment herein, each track table entry of the tables 1204, 1206 contains a null to indicate use of a corresponding track of the standard logical device 1202. Otherwise, an entry in the track table for each of the tables 1204, 1206 contains a pointer to one of the slots 1212-1214 in the cache 1208.

Each of the cache tables 1204, 1206 may be used for one of the chunks ofdata 852, 854 so that, for example, the table 1204 may correspond to thechunk of data 852 for sequence number N while the table 1206 maycorrespond to the chunk of data 854 for sequence number N−1. Thus, whendata is written by the host 822 to the local storage device 824, thedata is provided to the cache 1208 and an appropriate pointer of thetable 1204 is adjusted. Note that the data will not be removed from thecache 1208 until the data is destaged to the standard logical device1202 and the data is also released by a mechanism associated with thetable 1204, as described elsewhere herein.

In an embodiment herein, one of the tables 1204, 1206 is deemed “active”while the other is deemed “inactive”. Thus, for example, when thesequence number N is even, the table 1204 may be active while the table1206 is inactive. The active one of the tables 1204, 1206 handles writesfrom the host 822 while the inactive one of the tables 1204, 1206corresponds to the data that is being transmitted from the local storagedevice 824 to the remote storage device 826.

While the data that is written by the host 822 is accumulated using theactive one of the tables 1204, 1206 (for the sequence number N), thedata corresponding to the inactive one of the tables 1204, 1206 (forprevious sequence number N−1) is transmitted from the local storagedevice 824 to the remote storage device 826.

Once the data has been transmitted to the remote storage device 826, thecorresponding entry in the inactive one of the tables 1204, 1206 may beset to null. In addition, the data may also be removed from the cache1208 (i.e., the slot returned to the pool of slots for later use) if thedata in the slot is not otherwise needed for another purpose (e.g., tobe destaged to the standard logical device 1202). A mechanism may beused to ensure that data is not removed from the cache 1208 until allmirrors and the tables 1204, 1206 are no longer using the data. Such amechanism is described, for example, in U.S. Pat. No. 5,537,568 issuedon Jul. 16, 1996.

Referring to FIG. 35, a flow chart 1240 illustrates steps performed by the local storage device 824 in connection with the host 822 performing a write operation for embodiments where two tables are used. Processing begins at a first step 1242 where a slot corresponding to the write is locked. In an embodiment herein, each of the slots 1212-1214 of the cache 1208 corresponds to a track of data on the standard logical device 1202. Locking the slot at the step 1242 prevents additional processes from operating on the relevant slot during the processing performed by the local storage device 824 corresponding to the steps of the flow chart 1240.

Following the step 1242 is a step 1244 where a value for N, the sequencenumber, is set. Just as with the embodiment that uses lists rather thantables, the value for the sequence number obtained at the step 1244 ismaintained during the entire write operation while the slot is locked.As discussed elsewhere herein, the sequence number is assigned to eachwrite to determine the one of the chunks of data 852, 854 to which thewrite belongs. Writes performed by the host 822 are assigned the currentsequence number. It is useful that a single write operation maintain thesame sequence number throughout.

Following the step 1244 is a test step 1246, which determines if theinactive one of the tables 1204, 1206 already points to the slot thatwas locked at the step 1242 (the slot being operated upon). This mayoccur if a write to the same slot was provided when the sequence numberwas one less than the current sequence number. The data corresponding tothe write for the previous sequence number may not yet have beentransmitted to the remote storage device 826.

If it is determined at the test step 1246 that the inactive one of thetables 1204, 1206 does not point to the slot, then control transfersfrom the test step 1246 to another test step 1248, where it isdetermined if the active one of the tables 1204, 1206 points to theslot. It is possible for the active one of the tables 1204, 1206 topoint to the slot if there had been a previous write to the slot whilethe sequence number was the same as the current sequence number. If itis determined at the test step 1248 that the active one of the tables1204, 1206 does not point to the slot, then control transfers from thetest step 1248 to a step 1252 where a new slot is obtained for the data.Following the step 1252 is a step 1254 where the active one of thetables 1204, 1206 is made to point to the slot.

Following the step 1254, or following the step 1248 if the active one ofthe tables 1204, 1206 points to the slot, is a step 1256 where flags areset. At the step 1256, the write pending flag is set to indicate thatthe slot needs to be transmitted to the remote storage device 826. Inaddition, at the step 1256, an IN_CACHE flag is set to indicate that theslot needs to be destaged to the standard logical device 1202. Notethat, in some instances, if the active one of the tables 1204, 1206already points to the slot (as determined at the step 1248) it ispossible that the write pending and IN_CACHE flags were already setprior to execution of the step 1256. However, setting the flags at thestep 1256 ensures that the flags are set properly no matter what theprevious state.

Following the step 1256 is a step 1258 where an indirect flag in thetrack table that points to the slot is cleared, indicating that therelevant data is provided in the slot and not in a different slotindirectly pointed to. Following the step 1258 is a step 1262 where thedata being written by the host 822 is written to the slot. Following thestep 1262 is a step 1264 where the slot is unlocked. Following step1264, processing is complete.

If it is determined at the test step 1246 that the inactive one of thetables 1204, 1206 points to the slot, then control transfers from thestep 1246 to a step 1272, where a new slot is obtained. The new slotobtained at the step 1272 may be used for the inactive one of the tables1204, 1206 to effect the transfer while the old slot may be associatedwith the active one of the tables 1204, 1206, as described below.

Following the step 1272 is a step 1274 where the data from the old slot is copied to the new slot that was obtained at the step 1272. Following the step 1274 is a step 1276 where the indirect flag (discussed above) is set to indicate that the track table entry for the inactive one of the tables 1204, 1206 points to the old slot but that the data is in the new slot which is pointed to by the old slot. Thus, setting the indirect flag at the step 1276 affects the track table of the inactive one of the tables 1204, 1206 to cause the track table entry to indicate that the data is in the new slot.

Following the step 1276 is a step 1278 where the mirror bits for therecords in the new slot are adjusted. Any local mirror bits that werecopied when the data was copied from the old slot to the new slot at thestep 1274 are cleared since the purpose of the new slot is to simplyeffect the transfer for the inactive one of the tables. The old slotwill be used to handle any local mirrors. Following the step 1278 is thestep 1262 where the data is written to the slot. Following step 1262 isthe step 1264 where the slot is unlocked. Following the step 1264,processing is complete.
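The write path of the flow chart 1240 may be sketched as follows. This is a simplified, single-threaded illustration in Python under stated assumptions: the names Slot, Entry, LocalDevice and host_write are hypothetical, the locking of the steps 1242 and 1264 is omitted, and mirror bits are reduced to a single integer. The sketch also makes two assumptions that the text does not list as numbered steps: it records the old slot in the active table on the copy path, and it skips the copy when the inactive entry is already marked indirect.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Slot:
    data: bytes = b""
    write_pending: bool = False   # slot still to be sent to the remote device
    in_cache: bool = False        # slot still to be destaged locally
    local_mirror_bits: int = 0
    redirect: Optional["Slot"] = None  # old slot -> new slot holding prior data


@dataclass
class Entry:
    slot: Optional[Slot] = None
    indirect: bool = False


class LocalDevice:
    def __init__(self, num_tracks: int) -> None:
        self.tables: List[List[Entry]] = [
            [Entry() for _ in range(num_tracks)] for _ in range(2)
        ]
        self.sequence_number = 0

    def host_write(self, track: int, data: bytes) -> None:
        n = self.sequence_number                 # step 1244: fixed for the write
        active = self.tables[n % 2][track]
        inactive = self.tables[(n + 1) % 2][track]

        # Test step 1246 (the "not indirect" guard is an assumption).
        if inactive.slot is not None and not inactive.indirect:
            old = inactive.slot
            new = Slot(data=old.data,            # steps 1272, 1274: copy old data
                       write_pending=old.write_pending,
                       local_mirror_bits=old.local_mirror_bits)
            old.redirect = new
            inactive.indirect = True             # step 1276
            new.local_mirror_bits = 0            # step 1278
            active.slot = old                    # assumption: implied by the text
            slot = old
        else:
            if active.slot is None:              # test step 1248
                active.slot = Slot()             # steps 1252, 1254
            slot = active.slot
            slot.write_pending = True            # step 1256
            slot.in_cache = True
            active.indirect = False              # step 1258
        slot.data = data                         # step 1262
```

With this model, a write to a track during the sequence number N followed by a write to the same track during N+1 leaves the earlier data reachable through the inactive entry (via the redirect pointer) while the new data occupies the old slot.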

Referring to FIG. 36, a flow chart 1300 illustrates steps performed inconnection with the local storage device 824 transmitting the chunk ofdata 854 to the remote storage device 826 when the data has beenaccumulated according to the embodiment illustrated in connection withFIG. 34. The transmission essentially involves scanning the inactive oneof the tables 1204, 1206 for tracks that have been written theretoduring a previous iteration when the inactive one of the tables 1204,1206 was active.

Processing begins at a first step 1302 where the first track of theinactive one of the tables 1204, 1206 is pointed to in order to beginthe process of iterating through all of the tracks. Following the firststep 1302 is a test step 1304 where it is determined if the writepending flag is set. As discussed elsewhere herein, the write pendingflag is used to indicate that a slot (track) contains data that needs tobe transmitted to the remote storage device 826. The write pending flagbeing set indicates that at least some data for the slot (track) is tobe transmitted. In an embodiment herein, the entire slot is nottransmitted. Rather, only records within the slot that have theappropriate mirror bits set (indicating the records have changed) aretransmitted to the remote storage device 826. However, in otherembodiments, it may be possible to transmit the entire slot, providedthat the remote storage device 826 only writes data corresponding torecords having appropriate mirror bits set and ignores other data forthe track, which may or may not be valid.

If it is determined at the test step 1304 that the cache slot beingprocessed has the write pending flag set, then control transfers fromthe step 1304 to a test step 1305, where it is determined if the slotcontains the data or if the slot is an indirect slot that points toanother slot that contains the relevant data. In some instances, a slotmay not contain the data for the portion of the disk that corresponds tothe slot. Instead, the slot may be an indirect slot that points toanother slot that contains the data. If it is determined at the step1305 that the slot is an indirect slot, then control transfers from thestep 1305 to a step 1306, where the data (from the slot pointed to bythe indirect slot) is obtained. Thus, if the slot is a direct slot, thedata being sent is stored in the slot while if the slot is an indirectslot, the data being sent is in another slot pointed to by the indirectslot.

Following the step 1306, or the step 1305 if the slot is a direct slot, is a step 1307 where data being sent (directly or indirectly from the slot) is transmitted to the remote storage device 826. Following the step 1307 is a test step 1308 where it is determined if the remote storage device 826 has acknowledged receipt of the data. If not, then control transfers from the step 1308 back to the step 1307 to resend the data. In other embodiments, different and more involved processing may be used to send data and acknowledge receipt thereof. Such processing may include error reporting and alternative processing that is performed after a certain number of attempts to send the data have failed.

Once it is determined at the test step 1308 that the data has beensuccessfully sent, control passes from the step 1308 to a step 1312 toclear the write pending flag (since the data has been successfullysent). Following the step 1312 is a step 1314 where appropriate mirrorflags are cleared to indicate that at least the remote storage device826 no longer needs the data. In an embodiment herein, each record thatis part of a slot (track) has individual mirror flags indicating whichmirrors use the particular record. The remote storage device 826 is oneof the mirrors for each of the records and it is the flags correspondingto the remote storage device 826 that are cleared at the step 1314.

Following the step 1314 is a test step 1316 which determines if any ofthe records of the track being processed have any other mirror flags set(for other mirror devices). If not, then control passes from the step1316 to a step 1318 where the slot is released (i.e., no longer beingused). In some embodiments, unused slots are maintained in a pool ofslots available for use. Note that if additional flags are still set forsome of the records of the slot, it may mean that the records need to bedestaged to the standard logical device 1202 or are being used by someother mirror. Following the step 1318, or following the step 1316 ifmore mirror flags are present, is a step 1322 where the pointer that isused to iterate through each track entry of the inactive one of thetables 1204, 1206 is made to point to the next track. Following the step1322 is a test step 1324 which determines if there are more tracks ofthe inactive one of the tables 1204, 1206 to be processed. If not, thenprocessing is complete. Otherwise, control transfers back to the teststep 1304, discussed above. Note that the step 1322 is also reached fromthe test step 1304 if it is determined that the write pending flag isnot set for the track being processed.
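The scan of the flow chart 1300 may be sketched as follows. This is an illustrative Python model only. The names Slot, REMOTE and transmit_inactive, the representation of mirror flags as sets of mirror names, and the send callback (which returns True when receipt is acknowledged) are assumptions. The simple resend loop mirrors the steps 1307, 1308 and has no retry limit.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set

REMOTE = "remote"  # hypothetical name for the remote mirror


@dataclass
class Slot:
    records: Dict[int, bytes]
    mirror_flags: Dict[int, Set[str]]      # record id -> mirrors still using it
    write_pending: bool = True
    indirect_to: Optional["Slot"] = None   # set when data lives in another slot


def transmit_inactive(
    table: List[Optional[Slot]],
    send: Callable[[int, Dict[int, bytes]], bool],
    pool: List[Slot],
) -> None:
    for track, slot in enumerate(table):               # steps 1302, 1322, 1324
        if slot is None or not slot.write_pending:     # test step 1304
            continue
        source = slot.indirect_to or slot              # steps 1305, 1306
        # Only records flagged for the remote mirror are transmitted.
        payload = {
            rec: data
            for rec, data in source.records.items()
            if REMOTE in source.mirror_flags.get(rec, set())
        }
        while not send(track, payload):                # steps 1307, 1308
            pass                                       # resend until acknowledged
        slot.write_pending = False                     # step 1312
        for flags in source.mirror_flags.values():     # step 1314
            flags.discard(REMOTE)
        if not any(source.mirror_flags.values()):      # test step 1316
            pool.append(source)                        # step 1318: release slot
        table[track] = None                            # entry no longer needed
```

A slot whose records are still flagged for another mirror is not returned to the pool, which corresponds to the case where the records still need to be destaged or are in use elsewhere.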

Referring to FIG. 37, a diagram 1500 illustrates a host 1502 coupled to a plurality of local storage devices 1503-1505. The diagram 1500 also shows a plurality of remote storage devices 1506-1508. Although only three local storage devices 1503-1505 and three remote storage devices 1506-1508 are shown in the diagram 1500, the system described herein may be expanded to use any number of local and remote storage devices. As discussed in more detail below, the functionality associated with providing continuous backup at a single remote storage device coupled to a single local storage device may be extended to operate with multiple local and remote storage devices.

Each of the local storage devices 1503-1505 is coupled to acorresponding one of the remote storage devices 1506-1508 so that, forexample, the local storage device 1503 is coupled to the remote storagedevice 1506, the local storage device 1504 is coupled to the remotestorage device 1507 and the local storage device 1505 is coupled to theremote storage device 1508. The local storage devices 1503-1505 maytransfer data for remote continuous backup to the remote storage devices1506-1508 so that, for example, the local storage device 1503 maytransfer remote continuous backup data to the remote storage device1506.

In some instances, the host 1502 may run a single application thatsimultaneously uses more than one of the local storage devices1503-1505. In such a case, the application may be configured to ensurethat application data is consistent (recoverable) at the local storagedevices 1503-1505 if the host 1502 were to cease working at any timeand/or if one of the local storage devices 1503-1505 were to fail.However, since each of the connections between the local storage devices1503-1505 and the remote storage devices 1506-1508 may be asynchronousfrom the other connections, then there may be no assurance that data forthe application will be consistent (and thus recoverable) at the remotestorage devices 1506-1508. That is, for example, even though the dataconnection between the local storage device 1503 and the remote storagedevice 1506 (a first local/remote pair) is consistent and the dataconnection between the local storage device 1504 and the remote storagedevice 1507 (a second local/remote pair) is consistent, it is notnecessarily the case that the data on the remote storage devices 1506,1507 is always consistent if there is no synchronization between thefirst and second local/remote pairs.

For applications on the host 1502 that simultaneously use a plurality oflocal storage devices 1503-1505, it is desirable to have the data beconsistent and recoverable at the remote storage devices 1506-1508. Thismay be provided by a mechanism whereby the host 1502 controls cycleswitching at each of the local storage devices 1503-1505 so that thedata from the application running on the host 1502 is consistent andrecoverable at the remote storage devices 1506-1508. This functionalityis provided by a special application that runs on the host 1502 thatswitches a plurality of the local storage devices 1503-1505 intomulti-box mode, as described in more detail below.

Referring to FIG. 38, a table 1530 has a plurality of entries 1532-1534. Each of the entries 1532-1534 corresponds to a single local/remote pair of storage devices so that, for example, the entry 1532 may correspond to the pair of the local storage device 1503 and the remote storage device 1506, the entry 1533 may correspond to the pair of the local storage device 1504 and the remote storage device 1507 and the entry 1534 may correspond to the pair of the local storage device 1505 and the remote storage device 1508. Each of the entries 1532-1534 has a plurality of fields where a first field 1536a-1536c represents a serial number of the corresponding local storage device, a second field 1538a-1538c represents a session number used by the multi-box group, a third field 1542a-1542c represents the serial number of the corresponding remote storage device of the local/remote pair, and a fourth field 1544a-1544c represents the session number for the multi-box group. The table 1530 is constructed and maintained by the host 1502 in connection with operating in multi-box mode. In addition, the table 1530 is propagated to each of the local storage devices and the remote storage devices that are part of the multi-box group. The table 1530 may be used to facilitate recovery, as discussed in more detail below.

Different local/remote pairs may enter and exit multi-box modeindependently in any sequence and at any time. The host 1502 managesentry and exit of local storage device/remote storage device pairs intoand out of multi-box mode. This is described in more detail below.

Referring to FIG. 39, a flow chart 1550 illustrates steps performed by the host 1502 in connection with entry or exit of a local/remote pair into or out of multi-box mode. Processing begins at a first step 1552 where multi-box mode operation is temporarily suspended. Temporarily suspending multi-box operation at the step 1552 is useful to facilitate the changes that are made in connection with entry or exit of a local/remote pair into or out of multi-box mode. Following the step 1552 is a step 1554 where a table like the table 1530 is modified to either add or delete an entry, as appropriate. Following the step 1554 is a step 1556 where the modified table is propagated to the local storage devices and remote storage devices of the multi-box group. Propagating the table at the step 1556 facilitates recovery, as discussed in more detail elsewhere herein.

Following the step 1556 is a step 1558 where a message is sent to theaffected local storage device to provide the change. The local storagedevice may configure itself to run in multi-box mode or not, asdescribed in more detail elsewhere herein. As discussed in more detailbelow, a local storage device handling remote continuous backup operatesdifferently depending upon whether it is operating as part of amulti-box group or not. If the local storage device is being added to amulti-box group, the message sent at the step 1558 indicates to thelocal storage device that it is being added to a multi-box group so thatthe local storage device should configure itself to run in multi-boxmode. Alternatively, if a local storage device is being removed from amulti-box group, the message sent at the step 1558 indicates to thelocal storage device that it is being removed from the multi-box groupso that the local storage device should configure itself to not run inmulti-box mode.

Following step 1558 is a test step 1562 where it is determined if alocal/remote pair is being added to the multi-box group (as opposed tobeing removed). If so, then control transfers from the test step 1562 toa step 1564 where tag values are sent to the local storage device thatis being added. The tag values are provided with the data transmittedfrom the local storage device to the remote storage device in a mannersimilar to providing the sequence numbers with the data. The tag valuesare controlled by the host and set so that all of the local/remote pairssend data having the same tag value during the same cycle. Use of thetag values is discussed in more detail below. Following the step 1564,or following the step 1562 if a new local/remote pair is not beingadded, is a step 1566 where multi-box operation is resumed. Followingthe step 1566, processing is complete.
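The table 1530 and the membership processing of the flow chart 1550 may be sketched as follows. This is a hedged Python illustration. The names PairEntry, MultiBoxHost and change_membership are hypothetical, messages to storage devices are modeled as tuples appended to a list rather than real commands, and the serial numbers used in the usage example are invented placeholders.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PairEntry:
    """One entry of a table like the table 1530 (field names are assumptions)."""
    local_serial: str     # first field: serial number of the local device
    local_session: int    # second field: session number
    remote_serial: str    # third field: serial number of the remote device
    remote_session: int   # fourth field: session number


class MultiBoxHost:
    def __init__(self, initial_tag: int = 0) -> None:
        self.table: List[PairEntry] = []
        self.suspended = False
        self.tag = initial_tag
        self.sent: List[Tuple] = []   # stand-in for messages sent to devices

    def change_membership(self, entry: PairEntry, add: bool) -> None:
        self.suspended = True                                    # step 1552
        if add:                                                  # step 1554
            self.table.append(entry)
        else:
            self.table.remove(entry)
        for member in self.table:                                # step 1556
            self.sent.append(("table", member.local_serial))
            self.sent.append(("table", member.remote_serial))
        self.sent.append(("multi_box", entry.local_serial, add))  # step 1558
        if add:                                                  # test step 1562
            self.sent.append(("tag", entry.local_serial, self.tag))  # step 1564
        self.suspended = False                                   # step 1566
```

For example, adding a pair with `host.change_membership(PairEntry("L1503", 1, "R1506", 1), add=True)` propagates the table, notifies the local device, and sends the current tag value, while removing the same pair sends only the mode-change message.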

Referring to FIG. 40, a flow chart 1580 illustrates steps performed inconnection with the host managing cycle switching for multiplelocal/remote pairs running as a group in multi-box mode. As discussedelsewhere herein, multi-box mode involves having the host synchronizecycle switches for more than one remote/local pair to maintain dataconsistency among the remote storage devices. Cycle switching iscoordinated by the host rather than being generated internally by thelocal storage devices. This is discussed in more detail below.

Processing for the flow chart 1580 begins at a test step 1582 which determines if M seconds have passed. Just as with non-multi-box operation, cycle switches occur no sooner than every M seconds where M is a number chosen to optimize various performance parameters. As the number M is increased, the amount of overhead associated with switching decreases. However, increasing M also increases the amount of data that may potentially be lost in connection with a failure. In an embodiment herein, M is chosen to be thirty seconds, although other values for M may be used.

If it is determined at the test step 1582 that M seconds have notpassed, then control transfers back to the step 1582 to continue pollinguntil M seconds have passed. Once it is determined at the test step 1582that M seconds have passed, control transfers from the step 1582 to astep 1584 where the host queries all of the local storage devices in themulti-box group to determine if all of the local/remote pairs are readyto switch. The local/remote pairs being ready to switch is discussed inmore detail hereinafter.

Following the step 1584 is a test step 1586 which determines if all ofthe local/remote pairs are ready to switch. If not, control transfersback to the step 1584 to resume the query. In an embodiment herein, itis only necessary to query local/remote pairs that were previously notready to switch since, once a local/remote pair is ready to switch, thepair remains so until the switch occurs.

Once it is determined at the test step 1586 that all of the local/remote pairs in the multi-box group are ready to switch, control transfers from the step 1586 to a step 1588 where an index variable, N, is set equal to one. The index variable N is used to iterate through all the local/remote pairs (i.e., all of the entries 1532-1534 of the table 1530). Following the step 1588 is a test step 1592 which determines if the index variable, N, is greater than the number of local/remote pairs in the multi-box group. If not, then control transfers from the step 1592 to a step 1594 where an open window is performed for the Nth local storage device of the Nth pair by the host sending a command (e.g., an appropriate system command) to the Nth local storage device. Opening the window for the Nth local storage device at the step 1594 causes the Nth local storage device to suspend writes so that any write by a host that is not begun prior to opening the window at the step 1594 will not be completed until the window is closed (described below). Not completing a write operation prevents a second dependent write from occurring prior to completion of the cycle switch. Any writes in progress that were begun before opening the window may complete prior to the window being closed.

Following the step 1594 is a step 1596 where a cycle switch is performedfor the Nth local storage device. Performing the cycle switch at thestep 1596 involves sending a command from the host 1502 to the Nth localstorage device. Processing the command from the host by the Nth localstorage device is discussed in more detail below. Part of the processingperformed at the step 1596 may include having the host provide newvalues for the tags that are assigned to the data. The tags arediscussed in more detail elsewhere herein. In an alternative embodiment,the operations performed at the steps 1594, 1596 may be performed as asingle integrated step 1597, which is illustrated by the box drawnaround the steps 1594, 1596.

Following the step 1596 is a step 1598 where the index variable, N, isincremented. Following step 1598, control transfers back to the teststep 1592 to determine if the index variable, N, is greater than thenumber of local/remote pairs.

If it is determined at the test step 1592 that the index variable, N, isgreater than the number of local/remote pairs, then control transfersfrom the test step 1592 to a step 1602 where the index variable, N, isset equal to one. Following the step 1602 is a test step 1604 whichdetermines if the index variable, N, is greater than the number oflocal/remote pairs. If not, then control transfers from the step 1604 toa step 1606 where the window for the Nth local storage device is closed.Closing the window of the step 1606 is performed by the host sending acommand to the Nth local storage device to cause the Nth local storagedevice to resume write operations. Thus, any writes in process that weresuspended by opening the window at the step 1594 may now be completedafter execution of the step 1606. Following the step 1606, controltransfers to a step 1608 where the index variable, N, is incremented.Following the step 1608, control transfers back to the test step 1604 todetermine if the index variable, N, is greater than the number oflocal/remote pairs. If so, then control transfers from the test step1604 back to the step 1582 to begin processing for the next cycleswitch.
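One pass through the flow chart 1580 may be sketched as follows. This Python illustration is a sketch under assumptions, not the host application itself. RecordingDevice is a hypothetical stand-in for a local storage device that records the commands it receives, and the sleep and poll_interval parameters are introduced only so the example can be exercised without real delays.

```python
import time
from typing import Callable, List, Tuple


class RecordingDevice:
    """Hypothetical stand-in for a local storage device in a multi-box group."""

    def __init__(self, name: str, polls_until_ready: int, events: List[Tuple]):
        self.name = name
        self.polls_until_ready = polls_until_ready
        self.events = events

    def is_ready(self) -> bool:
        self.polls_until_ready -= 1
        return self.polls_until_ready <= 0

    def open_window(self) -> None:
        self.events.append((self.name, "open"))

    def cycle_switch(self, tag: int) -> None:
        self.events.append((self.name, "switch", tag))

    def close_window(self) -> None:
        self.events.append((self.name, "close"))


def run_one_cycle(
    devices: List[RecordingDevice],
    m_seconds: float,
    tag: int,
    sleep: Callable[[float], None] = time.sleep,
    poll_interval: float = 0.1,
) -> None:
    sleep(m_seconds)                                   # test step 1582
    pending = list(devices)
    while True:                                        # steps 1584, 1586
        # Only pairs that were not yet ready are queried again.
        pending = [d for d in pending if not d.is_ready()]
        if not pending:
            break
        sleep(poll_interval)
    for dev in devices:                                # steps 1588-1598
        dev.open_window()                              # step 1594
        dev.cycle_switch(tag)                          # step 1596
    for dev in devices:                                # steps 1602-1608
        dev.close_window()                             # step 1606
```

Note that every window is opened and every device switched before any window is closed, so no device completes a new write until all devices in the group have switched cycles.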

Referring to FIG. 41, a flow chart 1630 illustrates steps performed by alocal storage device in connection with cycle switching. The flow chart1630 of FIG. 41 replaces the flow chart 1050 of FIG. 33 in instanceswhere the local storage device supports both multi-box mode andnon-multi-box mode. That is, the flow chart 1630 shows steps performedlike those of the flow chart 1050 of FIG. 33 to support non-multi-boxmode and, in addition, includes steps for supporting multi-box mode.

Processing begins at a first test step 1632 which determines if thelocal storage device is operating in multi-box mode. Note that the flowchart 1550 of FIG. 39 shows the step 1558 where the host sends a messageto the local storage device. The message sent at the step 1558 indicatesto the local storage device whether the local storage device is inmulti-box mode or not. Upon receipt of the message sent by the host atthe step 1558, the local storage device sets an internal variable toindicate whether the local storage device is operating in multi-box modeor not. The internal variable may be examined at the test step 1632.

If it is determined at the test step 1632 that the local storage deviceis not in multi-box mode, then control transfers from the test step 1632to a step 1634 to wait M seconds for the cycle switch. If the localstorage device is not operating in multi-box mode, then the localstorage device controls its own cycle switching and thus executes thestep 1634 to wait M seconds before initiating the next cycle switch.

Following the step 1634, or following the step 1632 if the local storage device is in multi-box mode, is a test step 1636 which determines if all I/O's for a previous sequence number have completed. Once it is determined at the test step 1636 that all I/O's for a previous sequence number have completed, control transfers from the test step 1636 to a test step 1688 which determines if the inactive chunk for the local storage device is empty. Once it is determined at the test step 1688 that the inactive chunk is empty, control transfers from the step 1688 to a step 1689, where copying of data from the local storage device to the remote storage device is suspended. It is useful to suspend copying data until the sequence number switch is complete.

Following the step 1689 is a test step 1692 to determine if the localstorage device is in multi-box mode. If it is determined at the teststep 1692 that the local storage device is in multi-box mode, thencontrol transfers from the test step 1692 to a test step 1694 todetermine if the active chunk of the corresponding remote storage deviceis empty. The remote storage device sends a message to the local storagedevice once it has emptied its active chunk. In response to the message,the local storage device sets an internal variable that is examined atthe test step 1694.

Once it is determined at the test step 1694 that the active chunk of the remote storage device is empty, control transfers from the test step 1694 to a step 1696 where an internal variable is set on the local storage device indicating that the local storage device is ready to switch cycles. As discussed above in connection with the flow chart 1580, the host queries each of the local storage devices to determine if each of the local storage devices is ready to switch. In response to the query provided by the host, the local storage device examines the internal variable set at the step 1696 and returns the result to the host.

Following step 1696 is a test step 1698 where the local storage devicewaits to receive the command from the host to perform the cycle switch.As discussed above in connection with the flow chart 1580, the hostprovides a command to switch cycles to the local storage device when thelocal storage device is operating in multi-box mode. Thus, the localstorage device waits for the command at the step 1698, which is onlyreached when the local storage device is operating in multi-box mode.

Once the local storage device has received the switch command from the host, control transfers from the step 1698 to a step 1702 to send a commit message to the remote storage device. Note that the step 1702 is also reached from the test step 1692 if it is determined at the test step 1692 that the local storage device is not in multi-box mode. At the step 1702, the local storage device sends a commit message to the remote storage device. In response to receiving a commit message for a particular sequence number, the remote storage device will begin storing the data according to the continuous backup functionality discussed herein.

Following the step 1702 is a step 1706 where the sequence number isincremented and a new value for the tag (from the host) is stored. Thesequence number is as discussed above. The tag is the tag provided tothe local storage device at the step 1564 and at the step 1596, asdiscussed above. The tag is used to facilitate data recovery, asdiscussed elsewhere herein.

Following the step 1706 is a step 1708 where completion of the cycleswitch is confirmed from the local storage device to the host by sendinga message from the local storage device to the host. In someembodiments, it is possible to condition performing the step 1708 onwhether the local storage device is in multi-box mode or not, since, ifthe local storage device is not in multi-box mode, the host is notnecessarily interested in when cycle switches occur.

Following the step 1708 is a test step 1712 which determines if theremote storage device has acknowledged the commit message. Note that ifthe local/remote pair is operating in multi-box mode and the remotestorage device active chunk was determined to be empty at the step 1694,then the remote storage device should acknowledge the commit messagenearly immediately since the remote storage device will be ready for thecycle switch immediately because the active chunk thereof is alreadyempty.

Once it is determined at the test step 1712 that the commit message hasbeen acknowledged by the remote storage device, control transfers fromthe step 1712 to a step 1714 where the suspension of copying, which wasprovided at the step 1689, is cleared so that copying from the localstorage device to the remote storage device may resume. Following thestep 1714, processing is complete.
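The ordering of the steps of the flow chart 1630 may be sketched as follows. This Python illustration is a sketch under assumptions. The names LocalState and cycle_switch are hypothetical, and each wait in the flow chart is modeled by a caller-supplied wait_for callback that is expected to block until the named condition holds. Clearing the ready-to-switch variable at the end is an assumption that the text does not state.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class LocalState:
    multi_box: bool
    sequence_number: int = 0
    tag: Optional[int] = None
    copy_suspended: bool = False
    ready_to_switch: bool = False
    trace: List[str] = field(default_factory=list)


def cycle_switch(
    state: LocalState,
    wait_for: Callable[[str], None],
    new_tag: Optional[int] = None,
) -> None:
    if not state.multi_box:                       # test step 1632
        wait_for("m_seconds_elapsed")             # step 1634
    wait_for("previous_ios_complete")             # test step 1636
    wait_for("inactive_chunk_empty")              # test step 1688
    state.copy_suspended = True                   # step 1689
    if state.multi_box:                           # test step 1692
        wait_for("remote_active_chunk_empty")     # test step 1694
        state.ready_to_switch = True              # step 1696
        wait_for("host_switch_command")           # test step 1698
    state.trace.append("send_commit")             # step 1702
    state.sequence_number += 1                    # step 1706
    if new_tag is not None:
        state.tag = new_tag                       # tag supplied by the host
    state.trace.append("confirm_to_host")         # step 1708
    wait_for("commit_acknowledged")               # test step 1712
    state.copy_suspended = False                  # step 1714
    state.ready_to_switch = False                 # assumption: not in the text
```

Passing a list's append method as wait_for records the order in which conditions are awaited, which makes it straightforward to compare the multi-box and non-multi-box paths.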

Referring to FIG. 42, a flow chart 1740 illustrates steps performed inconnection with scanning the inactive buffers of the local storagedevice 824 to transmit data from the local storage device 824 to theremote storage device 826 when the data has been accumulated accordingto the embodiment illustrated in connection with FIG. 29. The flow chart1740 is similar to the flow chart 1000 of FIG. 32 and similar steps aregiven the same reference number. However, the flow chart 1740 includestwo additional steps 1742, 1744 which are not found in the flow chart1000 of FIG. 32. The additional steps 1742, 1744 are used to facilitatemulti-box processing. After data has been sent at the step 1012, controltransfers from the step 1012 to a test step 1742 which determines if thedata being sent is the last data in the inactive chunk of the localstorage device. If not, then control transfers from the step 1742 to thestep 1014 and processing continues as discussed above in connection withthe flow chart 1000 of FIG. 32. Otherwise, if it is determined at thetest step 1742 that the data being sent is the last data of the chunk,then control transfers from the step 1742 to the step 1744 to send aspecial message from the local storage device 824 to the remote storagedevice 826 indicating that the last data has been sent. Following thestep 1744, control transfers to the step 1014 and processing continuesas discussed above in connection with the flow chart 1000 of FIG. 32. Insome embodiments, the steps 1742, 1744 may be performed by a separateprocess (and/or separate hardware device) that is different from theprocess and/or hardware device that transfers the data.

Referring to FIG. 43, a flow chart 1750 illustrates steps performed in connection with the local storage device 824 scanning the inactive buffers to transmit data from the local storage device 824 to the remote storage device 826 when the data has been accumulated according to the embodiment illustrated in connection with FIG. 34. The flow chart 1750 of FIG. 43 is similar to the flow chart 1300 of FIG. 36 and similar steps are given the same reference number. However, the flow chart 1750 includes an additional step 1752, which is not found in the flow chart 1300 of FIG. 36. The additional step 1752 is used to facilitate multi-box processing and is like the additional step 1744 of the flow chart 1740 of FIG. 42. After it is determined at the test step 1324 that no more slots remain to be sent from the local storage device to the remote storage device, control transfers from the step 1324 to the step 1752 to send a special message from the local storage device 824 to the remote storage device 826 indicating that the last data for the chunk has been sent. Following the step 1752, processing is complete.
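The end-of-chunk notification added by the flow charts 1740 and 1750 may be sketched as follows. This is an illustrative Python model in which messages are tuples passed to a send callback. The names LAST_DATA, send_chunk_fig42 and send_chunk_fig43 are hypothetical. How an empty chunk is handled in the embodiment of FIG. 42 is not stated in the text, so the first function simply sends nothing in that case.

```python
from typing import Callable, Iterable, Tuple

LAST_DATA: Tuple = ("last_data",)   # hypothetical form of the special message


def send_chunk_fig42(items: Iterable[bytes], send: Callable[[Tuple], None]) -> None:
    """List-based embodiment: test for last data after each send."""
    pending = list(items)
    for index, item in enumerate(pending):
        send(("data", item))                    # step 1012
        if index == len(pending) - 1:           # test step 1742
            send(LAST_DATA)                     # step 1744


def send_chunk_fig43(slots: Iterable[bytes], send: Callable[[Tuple], None]) -> None:
    """Table-based embodiment: send the message once the scan is finished."""
    for slot in slots:
        send(("data", slot))                    # step 1307
    send(LAST_DATA)                             # step 1752, after test step 1324
```

In both variants the remote storage device receives exactly one end-of-chunk message after the final data item, which is what allows it to report that its active chunk has been fully received.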

As mentioned elsewhere herein, continuous backup may be provided at the remote storage device 826 for data from the local storage device 824 by having the remote storage device 826 store data provided thereto using techniques described herein in connection with providing continuous backup at the same storage device that generates the data (e.g., the flow chart 500 of FIG. 19). However, having the local storage device 824 and the remote storage device 826 presents additional possibilities for continuous backup storage, access, and restoration.

Note that it is possible to restore data to a particular point in time at the remote storage device 826 by rolling back all the data, reading particular data from the point in time, etc. using the techniques described above. However, since the host 822 is coupled to the local storage device 824, providing the host 822 with access to the point in time data requires either having the host 822 access the data from the remote storage device 826 or transferring the rolled back data from the remote storage device 826 to the local storage device 824.

Referring to FIG. 44, a flow chart 1760 illustrates steps performed in connection with restoring data to a particular point in time (target time) using the local storage device 824 and the remote storage device 826. Processing begins at a first step 1762 where continuous backup processing is stopped. Following the step 1762 is a step 1766 where the local storage device 824 is made not ready for access by the host 822 (or any other device). Following the step 1766 is a step 1768 where tracks on the local storage device 824 are set to invalid in instances where a corresponding track of the CB virtual device used on the remote storage device 826 points to a log device. Setting particular tracks to invalid at the step 1768 causes reads by the host 822 (or any other device reading data at the local storage device 824) to obtain the data for those tracks from the remote storage device 826.

Following the step 1768 is a step 1772 where the data is restored to the target time at the remote storage device 826. Processing performed at the step 1772 may include any of the techniques described elsewhere herein. Following the step 1772 is a step 1774 where the local storage device 824 is made ready for access by the host 822 (or other similar devices). Following the step 1774 is a step 1776 where the continuous backup process is restarted. Following the step 1776, processing is complete.
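The sequence of the flow chart 1760 may be sketched as follows. The sketch is illustrative only: the classes LocalDevice and RemoteDevice, their fields, and the dictionary used to represent whether continuous backup is running are all hypothetical stand-ins, and the restore at the step 1772 is reduced to recording the target time.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class RemoteDevice:
    # Hypothetical: track number -> True if the CB virtual device entry
    # for that track points to a log device.
    points_to_log: Dict[int, bool]
    restored_to: Optional[float] = None

    def restore(self, target_time: float) -> None:
        # Placeholder for any restore technique described elsewhere herein.
        self.restored_to = target_time


@dataclass
class LocalDevice:
    ready: bool = True
    invalid_tracks: Set[int] = field(default_factory=set)


def restore_to_target_time(local: LocalDevice, remote: RemoteDevice,
                           backup: Dict[str, bool], target_time: float) -> None:
    backup["running"] = False                        # step 1762: stop continuous backup
    local.ready = False                              # step 1766: local device not ready
    for track, in_log in remote.points_to_log.items():
        if in_log:                                   # step 1768: invalidate changed tracks
            local.invalid_tracks.add(track)
    remote.restore(target_time)                      # step 1772: restore at remote device
    local.ready = True                               # step 1774: local device ready again
    backup["running"] = True                         # step 1776: restart continuous backup
```

In this sketch, a later read of a track listed in `invalid_tracks` would be satisfied from the remote storage device rather than from the local copy.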

Referring to FIG. 45, a diagram 1780 illustrates an embodiment where a virtual device 1782 is provided at the local storage device 824 to provide access to a CB virtual device 1784 at the remote storage device 826. In the embodiment illustrated by the diagram 1780, the host 822 may access the CB virtual device 1784 by locally accessing the virtual device 1782. Reads and writes from and to the virtual device 1782 cause corresponding reads and writes from and to the CB virtual device 1784 via the data link between the local storage device 824 and the remote storage device 826. Thus, it is possible to use the virtual device 1782 to perform the processing illustrated elsewhere herein, such as reading data from a particular point in time illustrated by the flow chart 570 of FIG. 21. Coupling the devices 1782, 1784 may be by any appropriate technique, including conventional mirroring techniques.

Referring to FIG. 46, a diagram 1800 illustrates an alternative embodiment having a CB virtual device 1802, a standard logical device 1804, a log device 1806 and an I/O module 1808, all of which operate generally as described above in connection with providing continuous backup at the same storage device to which the host is providing direct I/O operations (FIG. 16) or providing I/O operations at a different device than the device to which the host is providing direct I/O operations (FIG. 27). The diagram 1800 also shows a mirror logical device 1804′ which provides a local mirror of the standard logical device 1804. The mirror logical device 1804′ may be implemented in a conventional fashion and may have the capability to be split from the standard logical device 1804 so that mirror functionality ceases and the mirror logical device 1804′ may be accessed for I/O operations separate from the standard logical device 1804 after the split.

The mirror logical device 1804′ may be used for a number of purposes. For example, the mirror logical device 1804′ may eliminate the need to allocate space for an entire track and copy an entire track on a first write by splitting the mirror logical device 1804′ at the initiation of the continuous backup. Thus, for example, the steps 502, 524, 526 of the flow chart 500 of FIG. 19 may be eliminated and all of the other processes that would otherwise obtain data from the base track of the log device 1806 would instead obtain that data from the mirror logical device 1804′. This avoids some of the overhead associated with the first write to a track. Alternatively, the embodiments described above in connection with FIG. 16 and FIG. 27 may be implemented as described, except that the copying of the entire track to the log device that would otherwise occur on the first write may be done as a background task by copying the entire track from the mirror logical device 1804′ instead of the standard logical device 1804.
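The first alternative above, in which base-track data is obtained from the split mirror rather than from a copy made on the first write, may be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of the system described herein: tracks are modeled as in-memory byte arrays, each log element is assumed to record the written bytes together with the indicator value at the time of the write, writes are assumed to arrive in non-decreasing indicator order, and the class and method names are hypothetical.

```python
from typing import Dict, List, Tuple


class MirrorBackedBackup:
    def __init__(self, standard: Dict[int, bytearray]) -> None:
        self.standard = standard
        # Splitting the mirror at initiation freezes a copy of every track.
        self.mirror: Dict[int, bytes] = {t: bytes(d) for t, d in standard.items()}
        # Hypothetical log layout: track -> list of (indicator, offset, data).
        self.log: Dict[int, List[Tuple[int, int, bytes]]] = {}

    def write(self, track: int, offset: int, data: bytes, indicator: int) -> None:
        # No base-track allocation and no full-track copy on the first write.
        self.log.setdefault(track, []).append((indicator, offset, bytes(data)))
        self.standard[track][offset:offset + len(data)] = data

    def read_at(self, track: int, indicator: int) -> bytes:
        # Base data comes from the mirror instead of a base track in the log.
        image = bytearray(self.mirror[track])
        for when, offset, data in self.log.get(track, []):
            if when <= indicator:
                image[offset:offset + len(data)] = data
        return bytes(image)
```

Reading at an indicator value earlier than the first write returns the mirror contents unchanged, which is the role the base track of the log device would otherwise play.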

Although the system described herein uses tracks as a unit of data for certain purposes, it should be understood that other units of data (including, possibly, variable length units of data) may be used. This also applies to other data structures and data units. In addition, in some instances, the order of steps in the flow charts may be modified, where appropriate.

In an embodiment herein, the timer may be used to keep track of the actual passage of time (e.g., wall time). For example, the timer may represent the number of seconds (or milliseconds, minutes, hours, etc.) since the system was initialized. Alternatively, the timer may represent the actual time of day in combination with the date. In contrast, the counter may be used to increment through states that are differentiated without necessarily any correlation to actual time. For example, the counter may be incremented on every write to the system, every N writes, or according to some other metric. In some embodiments, the counter may be a function (at least partially) of the value of the timer.
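The distinction between the timer and the counter may be illustrated with the following sketch. The class names and the injectable clock are hypothetical and chosen only for illustration; the timer here measures time since initialization, and the counter advances once every N writes.

```python
import time
from typing import Callable


class Timer:
    """Tracks elapsed time since initialization (one of the options described above)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._start = clock()

    def value(self) -> float:
        return self._clock() - self._start


class WriteCounter:
    """Advances through states with no necessary correlation to actual time."""

    def __init__(self, every_n: int = 1) -> None:
        self.every_n = every_n
        self._writes = 0
        self.value = 0

    def on_write(self) -> int:
        self._writes += 1
        if self._writes % self.every_n == 0:   # increment on every N-th write
            self.value += 1
        return self.value
```

With `every_n` set to 3, for instance, the counter value changes on the third and sixth writes regardless of how much wall time separates them.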

In some embodiments, it may be possible to provide a mechanism to consolidate data changes in a way that decreases the storage requirements while also decreasing the granularity. Data may be combined by merging data from consecutive (in time) elements stored on one or more log devices for a particular track or data segment. For instance, it may be possible to combine all of the changes corresponding to a single day into one element even though the original granularity used when the data was collected was finer than a day (e.g., a granularity of one minute). The trade-off is that combining multiple consecutive elements into a single element saves storage space, but reduces recovery granularity. However, a reduction in granularity may be acceptable in certain instances, such as after some time has passed. For example, it may be useful to initially provide continuous backup for a particular day with an initial fine granularity (e.g., one second), but then, after a first amount of time has passed (e.g., one day), to reduce the granularity (and storage requirements) to provide a mid level granularity (e.g., one minute). After a second amount of time has passed (e.g., another day), the granularity (and storage requirements) may be reduced further (e.g., one hour), and so on.
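Merging consecutive elements into coarser intervals may be sketched as follows. This is an illustrative sketch only: each element for a track is assumed to be a pair of a timestamp in whole seconds and a mapping from byte offset to the data written at that offset, and the function name and element layout are hypothetical rather than taken from the log device format described herein.

```python
from typing import Dict, List, Tuple

Element = Tuple[int, Dict[int, bytes]]   # (timestamp in seconds, offset -> data)


def consolidate(elements: List[Element], bucket_seconds: int) -> List[Element]:
    """Merge consecutive elements that fall into the same time bucket."""
    merged: Dict[int, Element] = {}
    for when, changes in sorted(elements, key=lambda e: e[0]):
        bucket = when // bucket_seconds
        latest, combined = merged.get(bucket, (when, {}))
        combined.update(changes)                 # later writes in the bucket win
        merged[bucket] = (max(latest, when), combined)
    return [merged[b] for b in sorted(merged)]
```

Each merged element keeps the latest timestamp of its bucket, so intermediate states within the bucket can no longer be recovered; that loss is the granularity trade-off described above.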

While the invention has been disclosed in connection with various embodiments, modifications thereon will be readily apparent to those skilled in the art. Accordingly, the spirit and scope of the invention is set forth in the following claims.

1. A method of providing continuous backup of a storage device, comprising: subdividing the storage device into subsections; providing a mirror device of the storage device that contains a copy of data that is on the storage device when the continuous backup is initiated; providing a time indicator that is modified periodically; and in response to a request to write new data to a particular subsection of the storage device at a particular time, maintaining data being overwritten by the new data according to the particular subsection and according to a value of the indicator at the particular time, wherein, for a first write after the continuous backup is initiated, data from the mirror device is used to maintain data being overwritten.
2. A method, according to claim 1, wherein the subsections are tracks.
3. A method, according to claim 1, wherein maintaining the data being overwritten includes constructing a linked list of portions of data for each of the subsections.
4. A method, according to claim 3, wherein the portions of data have variable sizes.
5. A method, according to claim 1, wherein in response to two data write operations to a particular subsection at a particular value of the indicator, data being written for each of the two data write operations is combined if data for the second data write operation is a subset of data for the first data write operation.
6. A method, according to claim 1, further comprising: restoring the storage device to a state thereof at a particular point in time by writing the maintained data to the storage device.
7. A method, according to claim 6, wherein writing the maintained data to the storage device includes constructing subsections of the data by combining separate portions thereof corresponding to the same subsection.
8. A method, according to claim 1, further comprising: inserting data for a particular subsection at a particular point in time by traversing data corresponding to the particular subsection to obtain an appropriate insertion point.
9. A method, according to claim 1, further comprising: reading data for a particular subsection at a particular point in time by traversing data corresponding to the particular subsection and reading from the group consisting of: data from the storage device, maintained data, and a combination of maintained data and data from the storage device.
10. A method, according to claim 1, further comprising: compressing data by combining consecutive portions for a subsection.
11. Computer software, in a storage medium, that provides continuous backup of a storage device, comprising: executable code that obtains a value of a time indicator that is modified periodically; and executable code that, in response to a request to write new data to a particular subsection of the storage device at a particular time, maintains data being overwritten by the new data according to the particular subsection and according to a value of the indicator at the particular time wherein, for a first write after the continuous backup is initiated, data used to maintain data being overwritten is from a mirror device of the storage device, the mirror device containing a copy of data that is on the storage device when the continuous backup is initiated.
12. Computer software, according to claim 11, wherein the subsections are tracks.
13. Computer software, according to claim 11, wherein executable code that maintains the data being overwritten constructs a linked list of portions of data for each of the subsections.
14. Computer software, according to claim 13, wherein the portions of data have variable sizes.
15. Computer software, according to claim 11, wherein in response to two data write operations to a particular subsection at a particular value of the indicator, data being written for each of the two data write operations is combined if data for the second data write operation is a subset of data for the first data write operation.
16. Computer software, according to claim 11, further comprising: executable code that restores the storage device to a state thereof at a particular point in time by writing the maintained data to the storage device.
17. Computer software, according to claim 16, wherein executable code that writes the maintained data to the storage device constructs subsections of the data by combining separate portions thereof corresponding to the same subsection.
18. Computer software, according to claim 11, further comprising: executable code that inserts data for a particular subsection at a particular point in time by traversing data corresponding to the particular subsection to obtain an appropriate insertion point.
19. Computer software, according to claim 11, further comprising: executable code that reads data for a particular subsection at a particular point in time by traversing data corresponding to the particular subsection and reading from the group consisting of: data from the storage device, maintained data, and a combination of maintained data and data from the storage device.
20. Computer software, according to claim 11, further comprising: executable code that compresses data by combining consecutive portions for a subsection.